<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AdamFranco.com &#187; PHP</title>
	<atom:link href="http://www.adamfranco.com/tag/php/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.adamfranco.com</link>
	<description>Musings, projects, software, and photography.</description>
	<lastBuildDate>Thu, 06 Oct 2011 19:54:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Mirroring a Subversion repository on Github</title>
		<link>http://www.adamfranco.com/2010/12/05/mirroring-a-subversion-repository-on-github/</link>
		<comments>http://www.adamfranco.com/2010/12/05/mirroring-a-subversion-repository-on-github/#comments</comments>
		<pubDate>Sun, 05 Dec 2010 05:30:12 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[git-svn]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[source-control]]></category>
		<category><![CDATA[Subversion]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=445</guid>
		<description><![CDATA[For the past few months I have been doing a lot of work on the phpCAS library, mostly to improve the community trunk of phpCAS so that I wouldn&#8217;t have to maintain our own custom fork with support for the CAS attribute format we use at Middlebury College. The phpCAS project lead, Joachim Fritschi, has [...]]]></description>
			<content:encoded><![CDATA[<p>For the past few months I have been doing a lot of <a href="http://www.ohloh.net/p/phpcas/contributors/290013371731193">work</a> on the <a href="https://wiki.jasig.org/display/CASC/phpCAS">phpCAS library</a>, mostly to improve the community trunk of phpCAS so that I wouldn&#8217;t have to maintain our own custom fork with support for the <a href="https://issues.jasig.org/browse/PHPCAS-88">CAS attribute</a> format we use at Middlebury College. The phpCAS project lead, Joachim Fritschi, has been great to work with and I&#8217;ve had a blast helping out with the project.</p>
<p>The tooling has involved a few challenges however, since <a href="http://www.jasig.org/">Jasig</a> (the organization that hosts the <a href="http://www.jasig.org/cas">CAS</a> and phpCAS projects) uses <a href="http://subversion.apache.org/">Subversion</a> for its source-code repositories and we use <a href="http://git-scm.com/">Git</a> for all of our projects. Now, I could just suck it up and use Subversion when doing phpCAS development, but there are a few reasons I don&#8217;t:</p>
<ol>
<li>We make use of <a href="http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#submodules">Git submodules</a> to include phpCAS along with the source-code of our applications, necessitating the use of a public Git repository that includes phpCAS.</li>
<li>The <a href="http://www.kernel.org/pub/software/scm/git/docs/git-svn.html">git-svn</a> tools allow me to use git on my end to work with a Subversion repository, which is great because&#8230;</li>
<li>I find that Git&#8217;s fast history browsing and searching make troubleshooting and bug fixing much easier than any other tools I&#8217;ve used.</li>
</ol>
<p>For the past two years I have been using git-svn to work with the phpCAS repository and every so often pushing changes up to a <a href="https://github.com/adamfranco/phpcas/">public Git repository on GitHub</a>. Our applications reference this repository as a submodule when they need to make use of phpCAS. Now that I&#8217;ve been doing more work on phpCAS (and am more interested in keeping our applications using up-to-date versions), I&#8217;ve decided to automate the process of mirroring the Subversion repository on GitHub. Read on for details of how I&#8217;ve set this up and the scripts for keeping the mirror in sync.</p>
<p><span id="more-445"></span></p>
<p>On my development server I have a git repository I&#8217;ve cloned from the Jasig Subversion repository via:</p>
<pre>git svn clone --stdlayout https://source.jasig.org/cas-clients/phpcas/</pre>
<p>I use this repository for my phpCAS development and am continually using <code>git svn rebase</code> and <code>git svn dcommit</code> to update my branches from Subversion and commit changes back to the Subversion repository.</p>
<p>My goal was to fetch from Subversion and push all of the branches and tags from the Subversion repository to GitHub while ignoring any private branches or un-dcommitted changes I might have kicking about my development repository.</p>
<p><strong>Step 1: Add the GitHub repository as a remote</strong></p>
<pre>git remote add github git@github.com:adamfranco/phpcas.git</pre>
<p><strong>Step 2: Fetch the latest changes from the svn repository</strong><br />
To do this, I just needed to run <code>git svn fetch</code> to import commits from the Subversion repository into my Git repository.</p>
<p><strong>Step 3: Make Git tags for Subversion tag-branches</strong><br />
I may be doing something wrong, but it seems that Subversion tags come through <code>git svn</code> as git branches rather than as git &#8220;tag&#8221; objects. Basically they are a branch with a single commit that just adds the tag message, but no content change. Using <code>git show</code> I found I could grab the parent id, message, and other metadata from the &#8220;tag-branch&#8221;, then feed that into <code>git tag</code> to create actual tag objects in the git repository.</p>
<p style="text-align: center;"><a href="http://www.adamfranco.com/files/2010/12/git-svn_tags.png"><img class="aligncenter " title="git-svn_tags" src="http://www.adamfranco.com/files/2010/12/git-svn_tags.png" alt="" width="100%" /></a></p>
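<p>As a rough sketch of that step, the helper below assembles the <code>git tag</code> command from the pieces read off a tag-branch. The function name <code>buildTagCommand()</code> and the sample values passed to it are made up for illustration; only the <code>git tag -a -m</code> invocation and the <code>GIT_COMMITTER_DATE</code> variable are real Git features.</p>
<pre>
<code>&lt;?php

// Hypothetical helper (illustration only): build the shell command that turns
// one git-svn "tag-branch" into an annotated Git tag on the branch's parent commit.
function buildTagCommand($svnTag, $parent, $message, $date) {
	// "tags/1.2.0" becomes the tag name "1.2.0".
	$tagName = preg_replace('#^tags/#', '', $svnTag);

	// GIT_COMMITTER_DATE back-dates the tag object to the date of the Subversion tag.
	// escapeshellarg() keeps quotes in the message from breaking the command.
	return 'GIT_COMMITTER_DATE='.escapeshellarg($date)
		.' git tag -a -m '.escapeshellarg($message)
		.' '.escapeshellarg($tagName)
		.' '.escapeshellarg($parent);
}

// Sample values are assumptions, not taken from the phpCAS repository.
print buildTagCommand('tags/1.2.0', 'abc1234', 'Tagging 1.2.0', '2010-12-04 12:00:00 -0500')."\n";
</code>
</pre>
<p>Building the command in one place makes it easy to print it for inspection before letting a cron job run it.</p>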
<p><strong>Step 4: Push subversion branches and newly created tags to GitHub</strong><br />
When called with no parameters, <code>git push</code> (in its <code>matching</code> mode, the default in Git versions before 2.0) will push all branches that have matching names in both the source and the destination repository. This wasn&#8217;t going to work for me since I only want to automatically push the branch-state that exists in Subversion (not any un-dcommitted changes in my Git repository) and want to create mirrors of any new branches that appear in the Subversion repository. To accomplish this I needed to specify every branch individually. I determined the list of branches via:</p>
<pre>git branch -r | grep -v '/' | grep -v trunk</pre>
<p>then looped through them and appended them to the hard-coded mapping between the svn &#8220;trunk&#8221; and the GitHub &#8220;master&#8221;:</p>
<pre>$cmd = 'git push --tags github refs/remotes/trunk:refs/heads/master ';
foreach ($svnBranches as $branch) {
	$cmd .= 'refs/remotes/'.$branch.':refs/heads/'.$branch.' ';
}</pre>
<p><strong>All together: <code>update_github_phpcas</code></strong><br />
Below is a script which performs the tasks above. I&#8217;ve added it to my crontab so that it runs every half-hour and keeps my GitHub repository in sync with the Jasig Subversion repository.</p>
<div style="display: block; border: 1px dotted; padding: 5px;"><pre><code>#!/usr/local/bin/php
&lt;?php
/**
 * Script to mirror a Subversion repository on GitHub or another public Git repository.
 *
 * Author: Adam Franco (afranco@middlebury.edu)
 * Date: 2010-12-04
 * License: GNU General Public License (GPL) version 2 or later.
 */

chdir('/home/afranco/private_html/phpcas/');

// Fetch from svn.
`git svn fetch`;

// Look up all of the svn branches, dropping blank lines if there are none.
$svnBranches = explode("\n", trim(`git branch -r | grep -v '/' | grep -v trunk`));
$svnBranches = array_filter(array_map('trim', $svnBranches), 'strlen');

// Add all of our branches to the list of those to push.
$cmd = 'git push --tags github refs/remotes/trunk:refs/heads/master ';
foreach ($svnBranches as $branch) {
	$cmd .= escapeshellarg('refs/remotes/'.$branch.':refs/heads/'.$branch).' ';
}

// Ensure that Git tags are created for every SVN tag branch.
$svnTags = explode("\n", trim(`git branch -r | grep 'tags/'`));
$svnTags = array_filter(array_map('trim', $svnTags), 'strlen');
foreach ($svnTags as $svnTag) {
	$ref = escapeshellarg("refs/remotes/$svnTag");
	$parent = trim(shell_exec("git show --format=\"format:%P\" $ref"));

	// If there are no tags on the parent of the tag branch, add one.
	if (!strlen(trim(`git tag --contains $parent`))) {
		$message = shell_exec("git show --format=\"format:%s%ntagged by %aN on %aD\" $ref");
		$date = shell_exec("git show --format=\"format:%ai\" $ref");
		$tagName = str_replace('tags/', '', $svnTag);

		// Quote each value so that a tag message containing quotes cannot break the command.
		$tagCmd = 'GIT_COMMITTER_DATE='.escapeshellarg(trim($date)).' git tag -a -m '.escapeshellarg($message).' '.escapeshellarg($tagName).' '.$parent;
#		print $tagCmd ."\n";
#		print "Creating tag $tagName\n";
		`$tagCmd`;
	}
}

#print $cmd;
#print "\n";

$output = `$cmd 2&gt;&amp;1`;

if (trim($output) != 'Everything up-to-date')
	print $output."\n";
</code></pre>
</div>
<p>I think that this script should work with very few changes for mirroring any Subversion repository as a Git repository.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2010/12/05/mirroring-a-subversion-repository-on-github/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Adding reverse-proxy caching to PHP applications</title>
		<link>http://www.adamfranco.com/2010/06/14/adding-reverse-proxy-caching-to-php-applications/</link>
		<comments>http://www.adamfranco.com/2010/06/14/adding-reverse-proxy-caching-to-php-applications/#comments</comments>
		<pubDate>Mon, 14 Jun 2010 16:03:57 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[reverse-proxy]]></category>
		<category><![CDATA[Varnish]]></category>
		<category><![CDATA[web-development]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=426</guid>
		<description><![CDATA[Note: This is a cross-post of documentation I am writing about Lazy Sessions. Why use reverse-proxy caching? For most public-facing web applications, the significant majority of their traffic is anonymous, non-authenticated users. Even with a variety of internal data-cache mechanisms and other good optimizations, a large amount of code execution goes into executing a PHP [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: This is a cross-post of <a href="http://wiki.github.com/adamfranco/lazy_sessions/adding-reverse-proxy-caching-to-php-applications">documentation I am writing about Lazy Sessions</a>.</em></p>
<h1>Why use reverse-proxy caching?</h1>
<p>For most public-facing web applications, the significant majority of their traffic is anonymous, non-authenticated users. Even with a variety of internal data-cache mechanisms and other good optimizations, a large amount of code execution goes into executing a <span class="caps">PHP</span> application to generate a page even if the content of this page will be the same for many users. Code and query optimization are very important to improving the experience for all users of a web application, but even the most basic &ldquo;Hello World&rdquo; script will top out at about 3k requests/second due to the overhead of Apache and <span class="caps">PHP</span> &mdash; many real applications top out at less than 200 requests/second. Varnish, a light-weight proxy-server that can run on the same host as the webserver, can cache pages in memory and can serve them at rates of more than 10k requests/second with thousands of concurrent connections.</p>
<p>While the point of web-applications is to have content be dynamic and easily changeable, for most applications and most of the anonymous users, receiving content that is slightly stale (cached for 5 minutes or something similar) isn&rsquo;t a big deal. Sure, visitors to your blog might not see the latest post for a few minutes, but they will get their response in 4 milliseconds rather than 2 seconds.</p>
<p>Should your site get posted on Slashdot, a caching reverse-proxy server will give anonymous visitor #2 and up the same page from cache (until expiration), while authenticated users continue to have their requests passed through to the Apache/<span class="caps">PHP</span> back-end. Everyone wins.</p>
<p><span id="more-426"></span></p>
<h1>Caveats</h1>
<p>Before we get into how to set this up, you should be aware of a few caveats (in addition to increased complexity) that come with this scheme.</p>
<h2>1. Stale Content</h2>
<p>Ideally, pages would always be served from the cache for as long as they don&rsquo;t change, and the application would then expire pages when they are changed on the back-end. Varnish has an <span class="caps">API</span> that supports this behavior and a <a href="http://drupal.org/project/Varnish">Drupal Varnish module</a> is being developed to do this dynamic cache-clearing for Drupal sites, but overall, dynamic cache clearing is much more difficult to set up than time-based cache expiration.</p>
<p>When using time-based cache expiration, the challenge is to balance the needs for content freshness (shorter cache lifetimes) against the efficiency of cache hits (longer cache lifetimes will result in more clients using the cached versions). For content that doesn&rsquo;t need to be up-to-the-minute fresh, a cache lifetime of around 5 minutes might be a good starting point. If the content only changes daily at a certain time, a fixed expiration time (shortly after the data sync) might be appropriate.</p>
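<p>For the fixed-expiration case, one possible approach is to compute <code>max-age</code> as the number of seconds remaining until the next daily expiration. The sketch below assumes a hypothetical data sync that finishes before 02:15 UTC; the function name <code>secondsUntilDailyExpiry()</code> is made up for this example.</p>
<pre>
<code>&lt;?php

// Hypothetical helper: seconds from $now until the next daily expiration
// at $hour:$minute UTC. Working in UTC avoids daylight-saving jumps.
function secondsUntilDailyExpiry($hour, $minute, $now) {
	$expiry = gmmktime($hour, $minute, 0,
		(int) gmdate('n', $now), (int) gmdate('j', $now), (int) gmdate('Y', $now));
	if ($expiry &lt;= $now) {
		// Today's slot has already passed, so use tomorrow's.
		$expiry += 86400;
	}
	return $expiry - $now;
}

// Assumed sync time of 02:15 UTC; adjust to match the real schedule.
$maxAge = secondsUntilDailyExpiry(2, 15, time());
header('Cache-Control: public, max-age='.$maxAge, true);
</code>
</pre>
<p>With this approach every cached copy expires at the same moment, so expect a short burst of back-end requests just after the expiration time.</p>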
<h2>2. Cookie Use</h2>
<p>If your application only uses cookies set by PHP&rsquo;s <code>session_start()</code> function, then <code>lazy_sessions.php</code> should work transparently without modification of either that include file or your application (other than including the file). If your application sets other cookies, then requests and responses that carry them will bypass the reverse-proxy&rsquo;s cache unless you specifically exclude those cookies in the reverse-proxy server&rsquo;s configuration.</p>
<h2>3. Data Caching in the <code>$_SESSION</code></h2>
<p>If you use the <code>$_SESSION</code> array as a data cache on anonymous requests, then these anonymous requests will be given a session cookie and their requests won&rsquo;t be served from the reverse-proxy&rsquo;s cache. Rather than using the <code>$_SESSION</code> array for non-user-specific data, cache such data with <span class="caps">APC</span> or memcached. This also has the advantage of such non-user-specific data not having to be rebuilt for every new client.</p>
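<p>A minimal sketch of moving such data out of the session, assuming the <span class="caps">APC</span> extension is loaded (under its successor, APCu, the equivalent calls are <code>apcu_fetch()</code> and <code>apcu_store()</code>). The functions <code>buildNavigation()</code> and <code>getNavigation()</code> and the cache key are hypothetical stand-ins for whatever your application computes.</p>
<pre>
<code>&lt;?php

// Hypothetical stand-in for expensive work that is the same for every visitor.
function buildNavigation() {
	return array('Home' =&gt; '/', 'About' =&gt; '/about');
}

// Fetch the shared value from APC, rebuilding it at most once every 5 minutes.
function getNavigation() {
	$nav = apc_fetch('site_navigation', $found);
	if (!$found) {
		$nav = buildNavigation();
		apc_store('site_navigation', $nav, 300);
	}
	return $nav;
}
</code>
</pre>
<p>Because nothing is written to <code>$_SESSION</code>, an anonymous request that calls <code>getNavigation()</code> stays eligible for the reverse-proxy&rsquo;s cache.</p>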
<h2>4. <code>flush()</code> and output buffering</h2>
<p>The default <span class="caps">PHP</span> session handling mechanism adds the session cookie to the response headers right when <code>session_start()</code> is called and writes the data off to the file-system after the script exits and the data has been sent. This default behavior ensures that users will always get a session cookie and saves the session data as the final processing step after all class destructors have been called.</p>
<p>Since we don&rsquo;t want to always set a session cookie, we need to remove the <code>Set-Cookie</code> header before headers are sent to the client. Output buffering with <code>ob_start()</code> will ensure that we have a chance to decide to clear the <code>Set-Cookie</code> header at script shutdown.</p>
<p>In some cases (such as incrementally sending large binary files) we want to send the content body (and therefore also the headers) before the script exits using the <code>flush()</code> function. To ensure that the session cookie is properly removed, <code>session_write_close()</code> must be called before <code>flush()</code> or any other code that causes headers to be sent.</p>
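<p>To illustrate why the buffering matters, here is a rough sketch of the general idea. It is not the code in <code>lazy_sessions.php</code>, which replaces the session handler instead; this simplified version assumes <span class="caps">PHP</span> 5.3 or later (for closures and <code>header_remove()</code>) and an application that sets no cookies other than the session cookie.</p>
<pre>
<code>&lt;?php

// Rough sketch only, NOT the actual lazy_sessions.php implementation.

// Hold the response body so that headers can still be changed at shutdown.
ob_start();

// PHP queues a Set-Cookie header here when it creates a new session.
session_start();

register_shutdown_function(function () {
	// Shutdown functions run before PHP flushes the output buffer, so the
	// queued headers can still be altered as long as nothing called flush().
	if (empty($_SESSION) &amp;&amp; !headers_sent()) {
		// Nothing was stored: discard the empty session and its cookie.
		session_destroy();
		// Caution: this removes every Set-Cookie header, not only the session one.
		header_remove('Set-Cookie');
	}
});
</code>
</pre>
<p>If the script calls <code>flush()</code> earlier, <code>headers_sent()</code> is already true by shutdown and the cookie can no longer be withdrawn, which is the situation this caveat describes.</p>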
<h1>Implementation</h1>
<p>Implementing reverse-proxy caching has three steps: <span class="caps">PHP</span> changes to enable lazy sessions, <span class="caps">PHP</span> changes to set cache-controlling headers, and finally the reverse-proxy server setup. For this example I&rsquo;ll use the Varnish reverse-proxy server, but others could be used instead.</p>
<h2>1. <span class="caps">PHP</span>: Lazy Sessions</h2>
<p>The first thing that needs to happen to make anonymous requests cache-able in an application that uses sessions is to ensure that sessions are only started when there is session data to be stored. By default, PHP&rsquo;s session handling mechanisms add a session cookie to the response header and store a session data file on the server on every page-load that calls <code>session_start()</code>. While this behavior makes it easy to write applications that use sessions, it effectively means that there is no way to differentiate between responses that are for a particular user and those that could be for many users.</p>
<p>Including the <a href="http://github.com/adamfranco/lazy_sessions/blob/master/lazy_sessions.php"><code>lazy_sessions.php</code> file</a> before <code>session_start()</code> is called will override the default session-handling mechanism with one that checks to see if there is any data in the <code>$_SESSION</code> array before sending the user a <code>Set-Cookie</code> header and storing a session file:</p>
<pre>
<code>&lt;?php

// Include files or other pre-session_start code

require_once('lazy_sessions/lazy_sessions.php');
session_start();

// The rest of the application code.
?&gt;
</code>
</pre>
<p>If your application needs to flush content and thereby send headers before script shutdown (such as incrementally sending file data), call <code>session_write_close()</code> if <code>session_start()</code> has been called for that script:</p>
<pre>
<code>&lt;?php

// Include files or other pre-session_start code

require_once('lazy_sessions/lazy_sessions.php');
session_start();

// other application code.

// If session_write_close() is not called before flushing, then the Set-Cookie
// header will be sent before our custom session handler has a chance to determine
// if a session is even needed.
session_write_close();

print "Hello";
flush();
print " World.";
flush();

?&gt;
</code>
</pre>
<h2>2. <span class="caps">PHP</span>: Cache-Control headers</h2>
<p>Now that we have our cookies straightened out, we need to ensure that our <span class="caps">PHP</span> scripts respond with <span class="caps">HTTP</span> headers that indicate that downstream clients such as our reverse-proxy and the user&rsquo;s browser are allowed to cache anonymous pages. There are a number of different <a href="http://wiki.github.com/adamfranco/lazy_sessions/cache-controlling-headers">Cache-Controlling Headers</a> that may affect whether a particular cache may store a given response. By default, <span class="caps">PHP</span> sets all of these headers to indicate that no caches may store any pages, ensuring that they are dynamic.</p>
<pre>
<code>&lt;?php

// If the session data is empty, then we could assume that there is no per-user data
// and that the response can be cached.
if (empty($_SESSION)) {

	// Alternatively, we could check an application-specific value (such as a user-id)
	// to determine if the response is for a particular user.
	// if (!isset($_SESSION['user_id'])) {

	// Cache for 5 minutes
	$maxAge = 300;

	header('Expires: '.gmdate('D, d M Y H:i:s', time() + $maxAge).' GMT', true);
	header('Cache-Control: public, max-age='.$maxAge, true);
	header('Pragma: ', true);
}

header('Vary: Cookie,Accept-Encoding', true);
</code>
</pre>
<p>The two most important headers with regard to caching with Varnish are the following:</p>
<h3>The <code>Cache-Control</code> header.</h3>
<p>The <code>Cache-Control: public, max-age=300</code> header indicates to any clients (such as the Varnish caching proxy) that this response can be cached in public caches valid for many downstream clients. The <code>max-age</code> portion of the header indicates that the cache may store this response for 300 seconds.</p>
<p>As I understand it (possibly wrong) Varnish only looks at the <code>max-age</code> portion of the <code>Cache-Control</code> header when determining how long to store a response. Apparently it ignores the <code>Expires</code> header for its cache-expiration purposes, though this header is passed on to downstream clients.</p>
<h3>The <code>Vary</code> header</h3>
<p>The <code>Vary: Cookie,Accept-Encoding</code> header tells Varnish (and in-browser caches) that they should not respond with the cached version of a response if the request includes a cookie or a different cookie from the request that previously had its response cached. Similarly, if one client says that it accepts gzip encoding via an <code>Accept-Encoding: gzip</code> request header, then the cached response may be compressed with gzip and should not be sent in response to requests from clients that do not state that they accept gzip encoding.</p>
<p>While Varnish&rsquo;s behavior is to never cache or respond from cache when cookies are present, without the <code>Vary: Cookie</code> response header, browsers or other downstream caches may respond with a cached response valid for only anonymous users even though a cookie is now present.</p>
<p>See my notes on <a href="http://wiki.github.com/adamfranco/lazy_sessions/cache-controlling-headers">Cache-Controlling Headers</a> for more details about other headers and how they affect the Varnish cache and in-browser caches.</p>
<h2>3. Varnish (Reverse-Proxy) Configuration</h2>
<p>The <code>/etc/varnish/default.vcl</code> config file controls how Varnish handles requests and responses, in particular whether or not it should cache them. Below are the contents of my <code>default.vcl</code> file.</p>
<div>
<p><strong>Notes:</strong></p>
<ol>
<li>The backend portion is the default; you will probably want to modify this to point at your correct backend hosts and ports.</li>
<li>The <code>vcl_recv</code> and <code>vcl_hash</code> sections come directly from the <a href="https://wiki.fourkitchens.com/display/PF/Configure+Varnish+for+Pressflow?focusedCommentId=15335604">Pressflow wiki</a> and are set up to allow requests that include Google Analytics cookies to be cached while not caching requests that include other cookies.</li>
<li>The <code>vcl_fetch</code> section is the default with my addition of the lines to unset empty Set-Cookie headers that can&rsquo;t be removed from within <span class="caps">PHP</span> &lt; 5.3.</li>
</ol>
</div>
<pre>
<code>
backend default {
	.host = "127.0.0.1";
	.port = "80";
}

sub vcl_recv {
	// Remove has_js and Google Analytics __* cookies.
	set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
	// Remove a ";" prefix, if present.
	set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
	// Remove empty cookies.
	if (req.http.Cookie ~ "^\s*$") {
		unset req.http.Cookie;
	}

	// Cache all requests by default, overriding the
	// standard Varnish behavior.
	// if (req.request == "GET" || req.request == "HEAD") {
	//   return (lookup);
	// }
}

sub vcl_hash {
	if (req.http.Cookie) {
		set req.hash += req.http.Cookie;
	}
}

sub vcl_fetch {
	if (!beresp.cacheable) {
		return (pass);
	}

	// If using PHP &lt; 5.3 there is no way to fully delete headers, so empty
	// Set-Cookie headers may be in the response. Ignore these empty headers.
	if (beresp.http.Set-Cookie ~ "^\s*$") {
		unset beresp.http.Set-Cookie;
	}

	if (beresp.http.Set-Cookie) {
		return (pass);
	}
	return (deliver);
}
</code>
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2010/06/14/adding-reverse-proxy-caching-to-php-applications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High-availability Drupal &#8212; File-handling</title>
		<link>http://www.adamfranco.com/2009/09/09/high-availability-drupal-file-handling/</link>
		<comments>http://www.adamfranco.com/2009/09/09/high-availability-drupal-file-handling/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 21:18:27 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=266</guid>
		<description><![CDATA[One of the requirements in the migration of our web sites to Drupal is that we create a robust and redundant platform that can stay running or degrade gracefully when hardware or software problems inevitably arise. While our sites get heavy use from our communities and the public, our traffic numbers are nowhere near [...]]]></description>
			<content:encoded><![CDATA[<p>One of the requirements in the migration of our <a href="http://www.middlebury.edu/">web</a> <a href="http://www.miis.edu">sites</a> to <a href="http://drupal.org/">Drupal</a> is that we create a robust and redundant platform that can stay running or degrade gracefully when hardware or software problems inevitably arise. While our sites get heavy use from our communities and the public, our traffic numbers are nowhere near those of a top-1000 site, and the sites could comfortably run off of one machine running both the database and web-server.<br />
<div id="attachment_297" class="wp-caption aligncenter" style="width: 541px"><img src="http://www.adamfranco.com/files/2009/09/1-SingleMachine.jpg" alt="Single Machine Configuration" title="Single Machine Configuration" width="531" height="332" class="size-full wp-image-297" /><p class="wp-caption-text">Single Machine Configuration</p></div><br />
This simple configuration however has the major weakness that any hiccups in the hardware or software of the machine will likely take the site offline until the issues can be addressed. In order to give our site a better chance at staying up as failures occur, we separate some of the functional pieces of the site onto discrete machines and then ensure that each function is redundant or fail-safe. This post and the next will detail a few of the techniques we have used to build a robust site.</p>
<p><span id="more-266"></span></p>
<h2>Pull out the database, use multiple web-servers</h2>
<p>The two main components of Drupal (and most similar web applications) are the web-server, which handles PHP execution and file-serving; and the MySQL database, which stores all data with the exception of uploaded files. By putting the database on a separate machine we can have multiple machines acting as front-end web-servers, all of them reading and writing to the same database. In this way, it doesn&#8217;t matter which web-server handles a given request as they will all get the same information out of the database. With two or more web-servers, our platform gains some redundancy since one web-server can fail while the second keeps handling requests.</p>
<p>With both web-servers pointing at the same database server, the database server still remains a single point of failure. Database clustering can alleviate this problem, but will be the subject of a future post.</p>
<h2>Multiple web-server challenges</h2>
<p>This redundancy does come at a cost in complexity, however, since we need to ensure that any uploaded files are available on both web-servers. There seem to be <a href="http://groups.drupal.org/node/1648">two primary ways</a> of tackling this problem (without resorting to costly and complex distributed file-system tools). The first is to use rsync to copy files between the web-servers every few minutes.<br />
<div id="attachment_299" class="wp-caption aligncenter" style="width: 610px"><img src="http://www.adamfranco.com/files/2009/09/2a-Two-Web-servers-rsync.jpg" alt="Two web servers with rsync" title="2a - Two Web servers - rsync" width="600" class="size-full wp-image-299" /><p class="wp-caption-text">Two web servers with rsync</p></div><br />
While this is reasonably simple to set up between two web-servers, it comes with significant downsides:</p>
<ul>
<li>Files cannot be deleted in the sync as newly-added files will exist on only one web-server. Since the sync is two-way, there is no way for the rsync processes to tell the difference between a new file and a deleted file.</li>
<li>Requests that come to the &#8220;other&#8221; web-server will not be able to access new files until the sync happens.</li>
<li>If additional web-servers are added, the sync process needs to be updated on every existing web-server to include the new web-server.</li>
</ul>
<p>The other alternative is to store uploaded files on a separate file-server, whose upload directory is mounted on each web-server using NFS. This method eliminates the synchronization problems, since all web-servers are essentially writing to the same directory.<br />
<div id="attachment_300" class="wp-caption aligncenter" style="width: 610px"><img src="http://www.adamfranco.com/files/2009/09/2b-Two-Web-servers-nfs.jpg" alt="Two web servers with NFS" title="2b - Two Web servers - nfs" width="600" class="size-full wp-image-300" /><p class="wp-caption-text">Two web servers with NFS</p></div><br />
On top of the complexity of adding a fourth machine (the file-server) to our mix, this method also leaves us with the file-server as a single point of failure &#8212; were it to go down, no uploaded files would be accessible.</p>
<h2>Best of both worlds</h2>
<p>In order to better solve this problem, the approach we took is to go the NFS route, but augment it with a backup copy of the files stored on the local file-system of each web-server. Every ten minutes or so a script (<a href='http://www.adamfranco.com/files/2009/09/sync_files.sh'>sync_files.sh</a>) checks whether the shared NFS directory is available and, if so, syncs the uploaded files to a backup location on the web-server&#8217;s file-system. This backup copy has its permissions set so that the Apache process cannot write to it, preventing synchronization problems if the shared NFS directory goes offline and we need to serve files out of the backup copy.<br />
<div id="attachment_302" class="wp-caption aligncenter" style="width: 610px"><img src="http://www.adamfranco.com/files/2009/09/3-Two-Web-servers-nfs+backup1.jpg" alt="Two web servers with NFS and local backup copies." title="3 - Two Web servers - nfs+backup" width="600" class="size-full wp-image-302" /><p class="wp-caption-text">Two web servers with NFS and local backup copies.</p></div><br />
A second script (<a href='http://www.adamfranco.com/files/2009/09/check_link.sh'>check_link.sh</a>) runs every minute and checks to see if the shared NFS directory is available. If it is offline, this script changes the symbolic link of our &#8220;files&#8221; directory so that Drupal will now use the read-only backup copy for its files. If the NFS directory comes back online, this script will again update the symbolic link to point at our writable shared NFS directory.</p>
<p>An important consideration in this setup is that the NFS share is mounted in &#8216;soft&#8217; mode so that file-access errors will time out quickly and allow for a timely switch-over to our backup files.<br />
<div class="wp-caption alignnone" style="width: 610px">
<pre>files.example.edu:/images       /mnt/files     nfs     soft    0 0</pre>
<p class="wp-caption-text">An example 'soft' mount line in /etc/fstab</p></div></p>
<p>If the default &#8216;hard&#8217; NFS mount is used, the check_link processes will hang indefinitely while trying to communicate with the file-server and never switch to our backup files.</p>
<p>Here is an example layout on the web-server to accomplish this setup:</p>
<pre style='width: 100%'># The scripts that will be run by cron:
/usr/local/bin/check_link.sh  # Run every minute
/usr/local/bin/sync_files.sh   # Run every 10 minutes

# The mounted NFS share:
/mnt/files/

# The backup copy of files:
/srv/files_read_only/

# The 'files' symbolic link, pointing normally at the NFS share:
/srv/files/ => /mnt/files/
# On NFS failure, this link will be switched to the backup directory:
/srv/files/ => /srv/files_read_only/

# The Drupal code directory:
/srv/drupal/
# The files directory for a site is a link into the switched files link
/srv/drupal/sites/www.example.com/files/ => /srv/files/www.example.com/files/
</pre>
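<p>To make the link switching concrete, here is a minimal sketch of the idea using the paths from the layout above. It is not the check_link.sh linked earlier, and the marker file <code>.share_is_mounted</code> is an assumption made for this example: a small file kept on the file-server so that an empty, unmounted mount point is not mistaken for a working share.</p>
<pre>#!/bin/sh
# Hypothetical sketch of the link-switching idea; not the linked check_link.sh.
NFS_DIR="${NFS_DIR:-/mnt/files}"
BACKUP_DIR="${BACKUP_DIR:-/srv/files_read_only}"
LINK="${LINK:-/srv/files}"
MARKER=".share_is_mounted"	# assumed marker file stored on the NFS share

# Print the directory that the 'files' link should point at right now.
choose_target() {
	if [ -f "$NFS_DIR/$MARKER" ] ; then
		echo "$NFS_DIR"
	else
		echo "$BACKUP_DIR"
	fi
}

# Re-point the link, but only when its target actually needs to change.
update_link() {
	target=$(choose_target)
	# Never point the link at a directory that does not exist.
	[ -d "$target" ] || return 0
	if [ "$(readlink "$LINK")" != "$target" ] ; then
		ln -sfn "$target" "$LINK"
	fi
}

update_link
</pre>
<p>With a &#8216;soft&#8217; mount, the marker test fails once the NFS time-out expires, so the next run switches the link to the backup copy; when the share returns, the same test succeeds and the link is switched back.</p>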
<p>By mounting the shared NFS directory, keeping a read-only local copy of the files, and monitoring the state of the NFS directory we gain the following benefits:</p>
<ul>
<li>No problems with synchronization as all web-servers share the same remote filesystem.</li>
<li>Synchronization of the local backup copies is not a problem as this is always a one-way sync rather than a two-way sync between different web-servers.</li>
<li>While the NFS file-server is still a single point of failure, read access to the uploaded files (via the backup copy) will be restored after a maximum of one minute plus the NFS time-out (2 minutes by default for &#8216;soft&#8217; mounts).</li>
<li>The web-servers don&#8217;t need to know about each other, easing configuration if additional web-servers are added.</li>
</ul>
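<p>The one-way sync mentioned in the list above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the sync_files.sh linked earlier: the paths come from the example layout, and the marker file <code>.share_is_mounted</code> is a hypothetical file kept on the share so that an unmounted (and therefore empty) mount point never causes the backup copy to be emptied.</p>
<pre>#!/bin/sh
# Hypothetical sketch of the one-way backup sync; not the linked sync_files.sh.
NFS_DIR="${NFS_DIR:-/mnt/files}"
BACKUP_DIR="${BACKUP_DIR:-/srv/files_read_only}"
MARKER=".share_is_mounted"	# assumed marker file stored on the NFS share

sync_backup() {
	# Skip this run entirely if the share does not look healthy.
	[ -f "$NFS_DIR/$MARKER" ] || return 0
	# One-way copy from the share to the local backup:
	#   -r recurse, -l keep symlinks, -t keep modification times,
	#   --delete removes backup files that no longer exist on the share.
	rsync -rlt --delete "$NFS_DIR/" "$BACKUP_DIR/"
}

# New files lose group/other write permission and belong to the account
# running the sync, so run this as a user other than the web-server's.
umask 022
sync_backup
</pre>
<p>Because the copy only ever flows from the share to the backup, a deleted file is simply removed from the backup on the next run, which avoids the new-versus-deleted ambiguity of a two-way sync.</p>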
<p>This configuration adds an extra machine to the platform mix and a bit of complexity, but it makes normal operation robust (instant file availability to all web-servers) and allows for graceful degradation (file-access becomes read-only) if the file-server goes down.</p>
<p><em>Many thanks to our system administrator, Mark Pyfrom, for all of his help in developing and testing this platform.</em></p>
<p><em>* Update on 2009-09-10: added note about &#8216;soft&#8217; NFS mounts and an example file-system layout.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2009/09/09/high-availability-drupal-file-handling/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Twitter Export Script</title>
		<link>http://www.adamfranco.com/2008/10/13/twitter-export-script/</link>
		<comments>http://www.adamfranco.com/2008/10/13/twitter-export-script/#comments</comments>
		<pubDate>Mon, 13 Oct 2008 15:47:35 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=88</guid>
		<description><![CDATA[I have been using Twitter as a log of my daily doings and wished to export my time-line for reformatting into a calendar format. Unfortunately TweetDumpr just retrieves the list of Tweets using a single fetch request which is limited by the Twitter API to a maximum of 200 Tweets. (Update: apparently TweetDumpr can get [...]]]></description>
			<content:encoded><![CDATA[<p>I have been using <a href="http://twitter.com">Twitter</a> as a <a href="http://twitter.com/afranco_work">log of my daily doings</a> and wished to export my time-line for reformatting into a calendar format. Unfortunately <a href="http://pantsland.com/2008/04/14/released-twitter-timeline-export-tweetdumpr/">TweetDumpr</a> just retrieves the list of Tweets using a single fetch request which is limited by the Twitter API to a maximum of 200 Tweets. (Update: apparently TweetDumpr <em>can</em> get more than 200 Tweets. It just didn&#8217;t say so in its description.)</p>
<p>I wanted to export all 600+ of my tweets, so I wrote the following little PHP script to accomplish this. I have not yet tested it with many concurrent users or added a form to select which user to export. Until I do so, I won&#8217;t be providing it as an end-user service. You are free to put it on your own machine and use it though.</p>
<p><strong>TwitterExport.php</strong></p>
<pre>&lt;?php
/**
 * This script will allow the export of complete user time-lines from the twitter
 * service. It joins together all pages of status updates into one large XML block
 * that can then be reformatted/processed with other tools.
 *
 * @since 10/13/08
 *
 * @copyright Copyright &copy; 2008, Adam Franco
 * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License (GPL)
 */

$user = 'afranco_work';	// Replace this with your user name.

header('Content-type: text/plain');

$allDoc = new DOMDocument;
$root = $allDoc-&gt;appendChild($allDoc-&gt;createElement('statuses'));
$root-&gt;setAttribute('type', 'array');

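$page = 1;
do {
	// Count the statuses found on this page; an empty page ends the loop.
	$numStatus = 0;

	// Fetch one page of the time-line. The @ suppresses PHP warnings so that
	// a failed request is reported by the check below instead.
	$pageDoc = new DOMDocument;
	$res = @$pageDoc-&gt;load('http://twitter.com/statuses/user_timeline/'.$user.'.xml?page='.$page);
	if (!$res) {
		print "\n\n**** Error loading page $page ****";
		exit;
	}

	// Copy every status element from this page into the combined document.
	foreach ($pageDoc-&gt;getElementsByTagName('status') as $status) {
		$root-&gt;appendChild($allDoc-&gt;createTextNode("\n"));
		$root-&gt;appendChild($allDoc-&gt;importNode($status, true));
		$numStatus++;
	}

	print "\nLoaded page $page with $numStatus status updates.";
	flush();

	$page++;
	sleep(1);	// Pause briefly between requests.

} while ($numStatus);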

print "\nDone loading timeline.";
print "\n\n\n";

$root-&gt;appendChild($allDoc-&gt;createTextNode("\n"));
print $allDoc-&gt;saveXml();
</pre>
<p><strong>Usage (assuming <a href="http://www.php.net">PHP</a> is installed)</strong></p>
<ol>
<li>Save the code above on your machine as twitter_export.php</li>
<li>Edit the code to change the <code>$user</code> variable to be your own Twitter username</li>
<li>From the command line run <code>php twitter_export.php</code></li>
<li>Copy/paste the XML output into a file for safe keeping and further processing</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2008/10/13/twitter-export-script/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>Outside-In:  Application Interoperability Using an OSID-Based Framework</title>
		<link>http://www.adamfranco.com/2008/06/25/outside-in-application-interoperability-using-an-osid-based-framework/</link>
		<comments>http://www.adamfranco.com/2008/06/25/outside-in-application-interoperability-using-an-osid-based-framework/#comments</comments>
		<pubDate>Thu, 26 Jun 2008 03:59:04 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[Harmoni]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Segue]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=76</guid>
		<description><![CDATA[This post describes an interoperability demonstration given at OpeniWorld Europe 2008 in Lyon, France. Abstract Segue and Concerto are two curricular applications built upon Harmoni, an Open Service Interface Definition-based (OSID) service-oriented application framework. This demonstration will show how website content created in Segue is stored as OSID Assets in Harmoni’s OSID Repository. Similarly, the [...]]]></description>
			<content:encoded><![CDATA[<p><em>This post describes an interoperability demonstration given at Open<span style='color: #a00'>i</span>World Europe 2008 in Lyon, France.</em></p>
<p><strong>Abstract </strong></p>
<p>Segue and Concerto are two curricular applications built upon Harmoni, an Open Service Interface Definition-based (OSID) service-oriented application framework. This demonstration will show how website content created in Segue is stored as OSID Assets in Harmoni’s OSID Repository. Similarly, the demonstration will show how multimedia assets created in Concerto can be stored in the same repository. Interoperability will be demonstrated as each application is used to view and make real-time modifications to the OSID Assets created using the other application, while at the same time respecting the authorizations given to those assets. Additionally, an OSID Repository to OAI-PMH gateway will be shown providing the LibraryFind meta-search tool with access to the metadata for content created in Segue, Concerto, and a lightweight, read-only OSID Repository.</p>
<ul>
<li><strong>Companion Paper: </strong> <a href='http://www.adamfranco.com/files/2008/06/openiworld-europe-2008-paper.pdf'>PDF (76 KB)</a></li>
<li><strong>Presentation Slides: </strong><a href='http://www.adamfranco.com/files/2008/06/openiworld-europe-2008-slides.pdf'>PDF (7.4 MB)</a></li>
</ul>
<p><strong>Software Demonstrated:</strong></p>
<ul>
<li><a href="http://harmoni.sf.net">Harmoni Application Framework</a> (Middlebury College)</li>
<li><a href="http://segue.sf.net">Segue</a> version 2 (Middlebury College)</li>
<li><a href="http://concerto.sf.net">Concerto</a> (Middlebury College)</li>
<li><a href="http://www.libraryfind.org">LibraryFind</a> (Oregon State University)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2008/06/25/outside-in-application-interoperability-using-an-osid-based-framework/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Segue 2.0 &#8211; Beta 20</title>
		<link>http://www.adamfranco.com/2008/06/09/segue-20-beta-20/</link>
		<comments>http://www.adamfranco.com/2008/06/09/segue-20-beta-20/#comments</comments>
		<pubDate>Tue, 10 Jun 2008 03:23:52 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Segue]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=75</guid>
		<description><![CDATA[Another week, another Segue 2 beta. This week&#8217;s installation brings visitor registration, a few new themes from Alex, theme migration from Segue 1, and a bunch of little bug fixes. Visitor registration brings with it a few interesting challenges. As in Segue 1, we want (and need) to be able to allow people outside of [...]]]></description>
			<content:encoded><![CDATA[<p>Another week, another Segue 2 beta. This week&#8217;s installation brings visitor registration, a few new themes from Alex, theme migration from Segue 1, and a bunch of little bug fixes.</p>
<p>Visitor registration brings with it a few interesting challenges. As in Segue 1, we want (and need) to be able to allow people outside of the Middlebury community to join in on public discussions hosted in Segue. As well, Middlebury users often need to give access to restricted parts of their sites to people off-campus with whom they are collaborating. Our visitor registration system therefore needs to be easy to use by registrants, keep out spammers, as well as enable searches for visitor accounts by community users.</p>
<p>To keep out spammers, the visitor registration form uses <a href="http://recaptcha.net/">reCAPTCHA</a> to try to verify that a human is sitting at the browser. There are other CAPTCHA systems out there, but I like the philosophy and approach of reCAPTCHA. Starting with words that OCR software had trouble reading seems like a good idea. After the registration form is filled out, Segue sends an email to the address entered with a unique registration code. Until the link in the email is clicked on (and hence the address verified) the account is locked.</p>
<p>To enable easy searching of visitor accounts, visitors are asked to enter their name. While there are a few restrictions on names, these are user-choosable. To provide some measure of differentiation between verified institution accounts and visitor accounts, visitor accounts have the user-chosen name followed by their email domain name in parentheses, e.g.:</p>
<blockquote><p>Adam Franco (gmail.com)</p></blockquote>
<p>I weighed including the entire email address as that is the only verified information we have about the visitor accounts, but I&#8217;d rather not open that information up for harvesting by spammers. If abuse becomes an issue, the visitor registration system also supports both black-lists and white-lists of email domains.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2008/06/09/segue-20-beta-20/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Segue 2 &#8211; The home stretch begins.</title>
		<link>http://www.adamfranco.com/2008/05/19/segue-2-the-home-stretch-begins/</link>
		<comments>http://www.adamfranco.com/2008/05/19/segue-2-the-home-stretch-begins/#comments</comments>
		<pubDate>Tue, 20 May 2008 04:02:24 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Work/Professional]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Segue]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/archives/73</guid>
		<description><![CDATA[We&#8217;ve recently announced our migration plans to the campus: We&#8217;ll be rolling out Segue 2 in mid-August for production use in the fall semester. I&#8217;ve now been working on Segue 2 directly or indirectly for 5 years, since June 2003. It has been a long road and it is wonderful to finally be cresting the [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.adamfranco.com/files/2008/05/segue-v2-04.png' alt='Segue 2 logo' style='float: right; margin-left: 10px; margin-bottom: 10px; border: 0px;'/>We&#8217;ve recently announced our migration plans to the campus: We&#8217;ll be rolling out <a href='https://segue.middlebury.edu/sites/segue/'>Segue 2</a> in mid-August for production use in the fall semester.</p>
<p>I&#8217;ve now been working on Segue 2 directly or indirectly for 5 years, since June 2003. It has been a long road and it is wonderful to finally be cresting the last rise. That said, as the <a href="http://sourceforge.net/tracker/?group_id=82171&#038;atid=565237">feature-request tracker</a> indicates, we still have a lot to do over the next 12 weeks.</p>
<p><strong>Theming</strong><br />
This past week I rebuilt the theming system for the 4th (and last before production) time. The challenge with the theming system is that we wanted to enable end-users to choose from a few straight-forward options for things like &#8216;overall color scheme&#8217;, &#8216;font size&#8217;, corner-treatment &#8212; not all of which mapped cleanly to CSS properties. As well, to enable more powerful themes, we needed to let theme developers wrap each content type with HTML tags in order to get some effects that are just not possible with plain CSS when the dimensions of the element are not known. Our first three theming implementations involved different PHP classes for each theme with methods for setting various options. Each implementation had its own strengths and weaknesses, but they were all hideously complex and required theme developers to know PHP in order to do more than change the CSS. The new theme implementation scraps all of that complexity and defines themes as a set of CSS files and HTML templates, with associated images. An extension to this simple base adds an option listing (defined in XML) that enables placeholders in the CSS and HTML templates to be replaced with values from end-user-choose-able options.</p>
<p>With the new theming system in place in development, Alex has set to work building the first three of the themes that will be distributed with Segue (Rounded Corners, Shadow Box, and Tabs), while I&#8217;ve been finishing up the user-interfaces for choosing theme options and enabling more advanced users to customize the theme CSS and HTML in their web-browser. So far Alex and I are pretty happy with the new theming system, and its simplicity should give it much longer legs than our previous attempts.</p>
<p>While it won&#8217;t make it to production, I eventually plan to have a theme-gallery that users can choose to publish their designs to for use by the rest of the community.</p>
<p><strong>Up Next</strong><br />
With theming out of the way the following are some of the next areas I&#8217;ll be working on in addition to fixing bugs and working out smaller kinks:</p>
<ul>
<li>Templates &#8211; starting points for sites</li>
<li>Enabling embedded videos from trusted sites (e.g. YouTube, Vimeo)</li>
<li>Visitor Registration</li>
<li>Copy/Move tools for Classic Mode</li>
<li>Display of RSS feeds</li>
</ul>
<p>Still a lot to do, but with each addition Segue 2 gets much closer to being able to take over as the primary course website system.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2008/05/19/segue-2-the-home-stretch-begins/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WordPress Enclosure Adder</title>
		<link>http://www.adamfranco.com/2007/09/06/wordpress-enclosure-adder/</link>
		<comments>http://www.adamfranco.com/2007/09/06/wordpress-enclosure-adder/#comments</comments>
		<pubDate>Thu, 06 Sep 2007 15:02:05 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=62</guid>
		<description><![CDATA[I&#8217;ve recently developed a small PHP script, the WPEnclosureAdder (source &#124; try) that goes through each item in an RSS feed, looks for links to YouTube videos or GoogleVideo videos, and then adds enclosure tags for the videos. If multiple videos are found embedded in a post, then that post is duplicated in the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently developed a small PHP script, the WPEnclosureAdder (<a href="http://www2.adamfranco.com/WPEnclosureAdder/WPEnclosureAdder.phps">source</a> | <a href="http://www2.adamfranco.com/WPEnclosureAdder/WPEnclosureAdder.phps">try</a>) that goes through each item in an RSS feed, looks for links to <a href="http://www.youtube.com">YouTube</a> videos or <a href="http://video.google.com/">GoogleVideo</a> videos, and then adds enclosure tags for the videos. If multiple videos are found embedded in a post, then that post is duplicated in the feed for each additional URL to provide compatibility with the many RSS readers/video-podcast viewers that expect a single enclosure per post.</p>
<p>I wrote this script because I have recently been making heavy use of <a href="http://www.getmiro.com/">Miro</a> (formerly known as &#8220;<a href="http://en.wikipedia.org/wiki/Miro_Media_Player" title="About Miro on Wikipedia">The Democracy Player</a>&#8221;) to download videos from YouTube in order to watch them off-line. Miro also provides a nice UI for aggregating videos and remembers my spot when I go back to watching later (nice for long documentaries). Miro, however, expects links to videos in RSS enclosure tags, something that WordPress (and probably other blogging software) doesn&#8217;t do for embedded videos.</p>
<p><a href="http://throwawayyourtelescreen.wordpress.com/">Throw Away Your Telescreen</a> is a video blog done by one of my favorite geo-political bloggers, <a href="http://complexsystemofpipes.wordpress.com/" title="Complex System of Pipes">Dave on Fire</a>, and a few others. In it they link out to the most interesting &#8220;documentaries, lectures, and interviews that follow a different editorial line&#8221; from the corporate press.  I highly recommend all of the videos on it that I <a href="http://www.adamfranco.com/?p=41" title="Confessions of an Economic Hit Man">have</a> <a href="http://www.adamfranco.com/?p=37" title="Money as Debt">seen</a>.</p>
<p>Throw Away Your Telescreen has all the makings of an indie-news channel, perfect for Miro which was developed to encourage <a href="http://participatoryculture.org/">participatory media and culture</a>. The only thing missing was to get the videos embedded in Throw Away Your Telescreen&#8217;s posts in such a way that Miro can find them. With the WPEnclosureAdder, this has now been done. Use <a href="http://www2.adamfranco.com/ThrowAwayYourTelescreenRSS.php" title="Copy this URL and add it as a channel in Miro.">this feed</a> to view Throw Away Your Telescreen in Miro.</p>
<hr /><p><strong>More about the WPEnclosureAdder:</strong></p>
<ul>
<li>View the <a href="http://www2.adamfranco.com/WPEnclosureAdder/WPEnclosureAdder.phps">source-code</a> of the latest version. (save-as to download)</li>
<li>License: <a href="http://www.gnu.org/copyleft/gpl.html">GNU General Public License</a> (GPL) version 3 or later</li>
<li>Requirements (for hosting it yourself): <a href="http://www.php.net">PHP</a> version 5.2 or later</li>
<li><a href="http://git.or.cz/">Git</a> Repository: http://www2.adamfranco.com/WPEnclosureAdder.git</li>
</ul>
<p>I wrote this script with Throw Away Your Telescreen in mind, but it should work with any other WordPress blog, and probably with RSS feeds generated from other blogging tools. To point it at another blog&#8217;s RSS feed, enter the feed URL in the form below:</p>
<form action="http://www2.adamfranco.com/WPEnclosureAdder/WPEnclosureAdder.php" method="get">
<input name="source" value="http://" type="text" />
<input value="submit" type="submit" /> </form>
<p>Using my version will use my default search strings for YouTube and GoogleVideo videos. If you would like to change what is being searched for, please download the script, change the configuration, and host it on your own website. I have licensed the WPEnclosureAdder under the <a href="http://www.gnu.org/copyleft/gpl.html">GNU General Public License</a> (GPL) version 3 or later, so you are free to copy and modify this script as per the terms of that license.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2007/09/06/wordpress-enclosure-adder/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>KML Joiner</title>
		<link>http://www.adamfranco.com/2007/08/29/kml-joiner/</link>
		<comments>http://www.adamfranco.com/2007/08/29/kml-joiner/#comments</comments>
		<pubDate>Wed, 29 Aug 2007 05:11:07 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[GIS/GPS]]></category>
		<category><![CDATA[KML]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=60</guid>
		<description><![CDATA[As of a few days ago, I am now able to generate KML versions of Flickr photosets for viewing in Google Earth/Maps. With that taken care of, I also want to easily combine these KML documents of images together with other KML files that show additional information, such as paths traveled, points of interest, etc. [...]]]></description>
			<content:encoded><![CDATA[<p>As of a few days ago, I am now able to <a href="http://www.adamfranco.com/?p=43">generate KML versions</a> of Flickr photosets for viewing in Google Earth/Maps. With that taken care of, I also want to easily combine these KML documents of images together with other KML files that show additional information, such as paths traveled, points of interest, etc.</p>
<p>To accomplish this task, I have written a new script, the <a href="http://www2.adamfranco.com/kml_joiner.php"><strong>KML Joiner</strong></a>  that will combine any KML documents on the web together into a single (referenced) KML document. (<a href="http://www2.adamfranco.com/kml_joiner.php">try it out</a>)</p>
<p><strong>More Detail:</strong> <em>for those interested in KML</em><br />
The resulting document is a collection of network links, each of which points to one of the KML URLs specified. Doing this rather than combining their text together into a static KML document prevents style collisions as well as allows changes in the source data to propagate to the combined document.</p>
<p>Refresh intervals can optionally be specified for every source document allowing for a server-friendly combination of static data with rapidly changing data. By default, no refresh interval is specified, making the linked documents load only once when first accessed.</p>
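<p>For readers less familiar with KML, a combined document of this kind looks roughly like the sample below. This is an illustrative sketch written against the KML 2.2 schema, not the exact output of the KML Joiner; the example.com URLs, the names, and the one-hour interval are all placeholder assumptions. The first link has no refresh settings and so loads only once, while the second is re-fetched on an interval.</p>
<pre>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;kml xmlns="http://www.opengis.net/kml/2.2"&gt;
  &lt;Document&gt;
    &lt;name&gt;Combined map&lt;/name&gt;
    &lt;!-- Static data: no refresh settings, so it is loaded once --&gt;
    &lt;NetworkLink&gt;
      &lt;name&gt;Path&lt;/name&gt;
      &lt;Link&gt;
        &lt;href&gt;http://www.example.com/path.kml&lt;/href&gt;
      &lt;/Link&gt;
    &lt;/NetworkLink&gt;
    &lt;!-- Changing data: re-fetched every 3600 seconds --&gt;
    &lt;NetworkLink&gt;
      &lt;name&gt;Photos&lt;/name&gt;
      &lt;Link&gt;
        &lt;href&gt;http://www.example.com/photos.kml&lt;/href&gt;
        &lt;refreshMode&gt;onInterval&lt;/refreshMode&gt;
        &lt;refreshInterval&gt;3600&lt;/refreshInterval&gt;
      &lt;/Link&gt;
    &lt;/NetworkLink&gt;
  &lt;/Document&gt;
&lt;/kml&gt;
</pre>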
<p><strong>Example:</strong></p>
<p>View the <a href='http://www2.adamfranco.com/kml_joiner.php?&#038;title=Conodoguinet+Conoe+Trip&#038;description=This+is+the+trip+mentioned+in+a+previous+%3Ca+href%3D%27http%3A%2F%2Fwww.adamfranco.com%2F%3Fp%3D59%27%3Eblog+post%3C%2Fa%3E%2C+but+this+time+the+data+sources+%281.+a+static+KML+file+with+the+path+and+house++placemark%2C+2.+a+dynamic+KML+document+generated+with+my+%3Ca+href%3D%27http%3A%2F%2Fwww2.adamfranco.com%2FphotosetToKML.php%27%3EPhoto+set+to+KML%3C%2Fa%3E+script%29+joined+together+with+the+%3Ca+href%3D%27http%3A%2F%2Fwww2.adamfranco.com%2Fkml_joiner.php%27%3EKML+Joiner+script%3C%2Fa%3E.&#038;urls%5B%5D=http%3A%2F%2Fwww2.adamfranco.com%2Fkml%2FConodoguinetPath.kml&#038;titles%5B%5D=Path&#038;refresh%5B%5D=&#038;urls%5B%5D=http%3A%2F%2Fwww2.adamfranco.com%2FphotosetToKML.php%3Fset%3D72157601703568728%26size%3Dsmall&#038;titles%5B%5D=Photos&#038;refresh%5B%5D='>KML Joiner  with fields filled in</a> that generates the map below.</p>
<p><iframe width="425" height="350" frameborder="no" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=q&#038;hl=en&#038;ie=UTF8&#038;om=1&#038;q=http:%2F%2Fwww2.adamfranco.com%2Fkml_joiner.php%3F%26title%3DConodoguinet%2BConoe%2BTrip%26description%3DThis%2Bis%2Bthe%2Btrip%2Bmentioned%2Bin%2Ba%2Bprevious%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww.adamfranco.com%252F%253Fp%253D59%2527%253Eblog%2Bpost%253C%252Fa%253E%252C%2Bbut%2Bthis%2Btime%2Bthe%2Bdata%2Bsources%2B%25281.%2Ba%2Bstatic%2BKML%2Bfile%2Bwith%2Bthe%2Bpath%2Band%2Bhouse%2B%2Bplacemark%252C%2B2.%2Ba%2Bdynamic%2BKML%2Bdocument%2Bgenerated%2Bwith%2Bmy%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww2.adamfranco.com%252FphotosetToKML.php%2527%253EPhoto%2Bset%2Bto%2BKML%253C%252Fa%253E%2Bscript%2529%2Bjoined%2Btogether%2Bwith%2Bthe%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww2.adamfranco.com%252Fkml_joiner.php%2527%253EKML%2BJoiner%2Bscript%253C%252Fa%253E.%26urls%255B%255D%3Dhttp%253A%252F%252Fwww2.adamfranco.com%252Fkml%252FConodoguinetPath.kml%26titles%255B%255D%3DPath%26refresh%255B%255D%3D%26urls%255B%255D%3Dhttp%253A%252F%252Fwww2.adamfranco.com%252FphotosetToKML.php%253Fset%253D72157601703568728%2526size%253Dsmall%26titles%255B%255D%3DPhotos%26refresh%255B%255D%3D&#038;ll=40.220497,-77.243939&#038;spn=0.0212,0.057612&#038;output=embed&#038;s=AARTsJqZXZ5mh9brNetd7bmO_tBkz1S9oQ"></iframe><br/><a 
href="http://maps.google.com/maps?f=q&#038;hl=en&#038;ie=UTF8&#038;om=1&#038;q=http:%2F%2Fwww2.adamfranco.com%2Fkml_joiner.php%3F%26title%3DConodoguinet%2BConoe%2BTrip%26description%3DThis%2Bis%2Bthe%2Btrip%2Bmentioned%2Bin%2Ba%2Bprevious%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww.adamfranco.com%252F%253Fp%253D59%2527%253Eblog%2Bpost%253C%252Fa%253E%252C%2Bbut%2Bthis%2Btime%2Bthe%2Bdata%2Bsources%2B%25281.%2Ba%2Bstatic%2BKML%2Bfile%2Bwith%2Bthe%2Bpath%2Band%2Bhouse%2B%2Bplacemark%252C%2B2.%2Ba%2Bdynamic%2BKML%2Bdocument%2Bgenerated%2Bwith%2Bmy%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww2.adamfranco.com%252FphotosetToKML.php%2527%253EPhoto%2Bset%2Bto%2BKML%253C%252Fa%253E%2Bscript%2529%2Bjoined%2Btogether%2Bwith%2Bthe%2B%253Ca%2Bhref%253D%2527http%253A%252F%252Fwww2.adamfranco.com%252Fkml_joiner.php%2527%253EKML%2BJoiner%2Bscript%253C%252Fa%253E.%26urls%255B%255D%3Dhttp%253A%252F%252Fwww2.adamfranco.com%252Fkml%252FConodoguinetPath.kml%26titles%255B%255D%3DPath%26refresh%255B%255D%3D%26urls%255B%255D%3Dhttp%253A%252F%252Fwww2.adamfranco.com%252FphotosetToKML.php%253Fset%253D72157601703568728%2526size%253Dsmall%26titles%255B%255D%3DPhotos%26refresh%255B%255D%3D&#038;ll=40.220497,-77.243939&#038;spn=0.0212,0.057612&#038;source=embed" style="color:#0000FF;text-align:left;font-size:small">View Larger Map</a></p>
<p>The map above shows the trip mentioned in a previous <a href='http://www.adamfranco.com/?p=59'>blog post</a>, but this time the data sources (1. a static KML file with the path and house placemark, 2. a dynamic KML document generated with my <a href='http://www2.adamfranco.com/photosetToKML.php'>Photo set to KML</a> script) were joined together with the <a href='http://www2.adamfranco.com/kml_joiner.php'>KML Joiner script</a> instead of being put together by hand in a text editor.</p>
<p><strong>Usage:</strong><br />
You are welcome to use this script <a href="http://www2.adamfranco.com/kml_joiner.php">hosted on my site</a>, or you can <a href="http://www2.adamfranco.com/kml_joiner.phps" title="Right-click and 'save as' to download.">download it</a> and run it on your own computer/webserver.</p>
<p>This script is available under the <a href="http://www.gnu.org/copyleft/gpl.html">GNU General Public License (GPL)</a> version 3 or later. (<a href="http://www2.adamfranco.com/kml_joiner.phps" title="View the sourcecode.">Source Code</a>)</p>
<p>Please post any suggestions for fixes or changes. Thanks!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2007/08/29/kml-joiner/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Flickr Photo Set to KML</title>
		<link>http://www.adamfranco.com/2007/08/23/flickr-photo-set-to-kml/</link>
		<comments>http://www.adamfranco.com/2007/08/23/flickr-photo-set-to-kml/#comments</comments>
		<pubDate>Fri, 24 Aug 2007 00:03:15 +0000</pubDate>
		<dc:creator>Adam</dc:creator>
				<category><![CDATA[Computers and Technology]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[GIS/GPS]]></category>
		<category><![CDATA[KML]]></category>
		<category><![CDATA[photography]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.adamfranco.com/?p=43</guid>
		<description><![CDATA[One of the things I (and others) have found lacking when working with geotagged images on Flickr is the ability to retrieve a &#8220;photo set&#8221; (Flickr&#8217;s take on a slideshow) as a KML document that can then be displayed in GoogleEarth, GoogleMaps, or other geo-browsers. Flickr provides some KML links and GeoRSS feeds, but these [...]]]></description>
			<content:encoded><![CDATA[<p>One of the things I (<a href="http://www.ogleearth.com/2007/08/flickr_to_googl.html" title="Ogle Earth">and others</a>) have found lacking when working with <a href="http://en.wikipedia.org/wiki/Geotagging" title="Definition on Wikipedia">geotagged</a> images on <a href="http://www.flickr.com" title="Flickr.com">Flickr</a> is the ability to retrieve a &#8220;photo set&#8221; (Flickr&#8217;s take on a slideshow) as a KML document that can then be displayed in GoogleEarth, GoogleMaps, or other geo-browsers. Flickr provides some KML links and GeoRSS feeds, but these are either limited to 20 items or can only be pointed at tags or users&#8217; photo-streams, not at a particular photo set.</p>
<p>To fill this niche, I present a small script I wrote to generate a KML file from the geotagged photos in a set:</p>
<hr /><p><a href="http://www2.adamfranco.com/photosetToKML.php" title="Photo Set to KML" style='font-size: large; font-weight: bold;'>Photo Set to KML</a> &nbsp; &nbsp; (<a href="http://www2.adamfranco.com/photosetToKML.php" title="Photo Set to KML">try it out</a>)</p>
<p><br/><strong>Features:</strong></p>
<ul>
<li>Generate a KML file from a Flickr photo set</li>
<li>Directly open the KML file in Google Maps</li>
<li>Choose what size image to include in the placemark description for each photo.</li>
<li>Optionally draw a path (line) from photo to photo, ordered in one of several ways: by date taken, by date uploaded, or by set order. Useful for making a quick and dirty map of a trip.</li>
</ul>
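<p>To give a rough idea of what the path option listed above involves, here is a minimal Python sketch. It is not the actual PHP script. The <code>photos</code> sample data, its field names, and the <code>tag</code> and <code>build_kml</code> helpers are all hypothetical, and a real script would fetch the photo details from the Flickr API. It writes one placemark per photo and then a single line through the photos ordered by date taken.</p>

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical sample data; a real script would fetch this from the Flickr API.
photos = [
    {"title": "Harbor", "taken": "2007-08-02 10:15:00", "lat": 42.05, "lon": -70.18},
    {"title": "Dunes", "taken": "2007-08-01 09:00:00", "lat": 42.06, "lon": -70.20},
]


def tag(name):
    """Qualify an element name with the KML namespace."""
    return "{%s}%s" % (KML_NS, name)


def build_kml(photos, draw_path=True):
    """Return a KML string with one placemark per photo and an optional path."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(tag("kml"))
    doc = ET.SubElement(kml, tag("Document"))

    # Order by date taken; this timestamp format sorts correctly as plain text.
    ordered = sorted(photos, key=lambda p: p["taken"])

    coords = []
    for photo in ordered:
        # KML coordinates are written longitude first, then latitude.
        coord = "%s,%s" % (photo["lon"], photo["lat"])
        coords.append(coord)
        mark = ET.SubElement(doc, tag("Placemark"))
        ET.SubElement(mark, tag("name")).text = photo["title"]
        point = ET.SubElement(mark, tag("Point"))
        ET.SubElement(point, tag("coordinates")).text = coord

    # A line needs at least two points.
    if draw_path and coords[1:]:
        path = ET.SubElement(doc, tag("Placemark"))
        ET.SubElement(path, tag("name")).text = "Path"
        line = ET.SubElement(path, tag("LineString"))
        ET.SubElement(line, tag("coordinates")).text = " ".join(coords)

    return ET.tostring(kml, encoding="unicode")


print(build_kml(photos))
```

<p>Sorting by upload date or by set order would only change the sort key. Building the document with an XML library rather than by string concatenation also means titles are escaped automatically.</p>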
<p><strong>Examples:</strong></p>
<ul>
<li><a href="http://www2.adamfranco.com/photosetToKML.php?set=72157624466650125&amp;size=medium">KML</a> / <a href="http://www2.adamfranco.com/photosetToKML.php?set=72157624466650125&amp;size=small&amp;format=open_google_maps">GoogleMaps</a> &#8211; Some photos from Cape Cod.<br />
<iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=q&amp;hl=en&amp;ie=UTF8&amp;om=1&amp;t=h&amp;q=http:%2F%2Fwww2.adamfranco.com%2FphotosetToKML.php%3Fset%3D72157624466650125%26size%3Dsmall%26path_order%3Dchron&amp;ll=42.06344,-70.1968&amp;spn=0.044605,0.072956&amp;z=13&amp;output=embed"></iframe><br /><small><a href="http://maps.google.com/maps?f=q&amp;hl=en&amp;ie=UTF8&amp;om=1&amp;t=h&amp;q=http:%2F%2Fwww2.adamfranco.com%2FphotosetToKML.php%3Fset%3D72157624466650125%26size%3Dsmall%26path_order%3Dchron&amp;ll=42.06344,-70.1968&amp;spn=0.044605,0.072956&amp;z=13&amp;source=embed" style="color:#0000FF;text-align:left">View Larger Map</a></small><br />
&nbsp;</li>
<li><a href="http://www.adamfranco.com/files/2007/08/turkey-2005-best.kml" title="Turkey Best.kml">KML</a> / <a href="http://maps.google.com/maps?f=q&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fwww.adamfranco.com%2Fwp-content%2Fuploads%2F2007%2F08%2Fturkey-2005-best.kml&amp;ie=UTF8&amp;z=6&amp;om=1">GoogleMaps</a> &#8211; A set of photos from a trip I took around Turkey, with lines drawn chronologically. Since this is a large set that causes GoogleMaps to time out, I downloaded the KML file and then re-uploaded it to my website. This is the method I recommend for large photo sets.<br />
<iframe src="http://maps.google.com/maps?f=q&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fwww.adamfranco.com%2Fwp-content%2Fuploads%2F2007%2F08%2Fturkey-2005-best.kml&amp;ie=UTF8&amp;om=1&amp;s=AARTsJrgP9DYat4aFUkI62xSDNmbgvcQFw&amp;ll=38.634036,31.003418&amp;spn=6.006381,9.338379&amp;z=6&amp;output=embed" marginheight="0" marginwidth="0" frameborder="no" height="350" scrolling="no" width="425"></iframe><br />
<a href="http://maps.google.com/maps?f=q&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fwww.adamfranco.com%2Fwp-content%2Fuploads%2F2007%2F08%2Fturkey-2005-best.kml&amp;ie=UTF8&amp;om=1&amp;ll=38.634036,31.003418&amp;spn=6.006381,9.338379&amp;z=6&amp;source=embed" style="color: #0000ff; text-align: left; font-size: small">View Larger Map</a><br />
&nbsp;</li>
</ul>
<p>You are welcome to use this script <a href="http://www2.adamfranco.com/photosetToKML.php">hosted on my site</a>, or you can <a href="https://github.com/adamfranco/PhotoSetToKML/zipball/master" title="Download">download it</a> and run it on your own computer/webserver. If you would like to run it yourself, please be aware of the following&#8230;</p>
<p><strong>System Requirements</strong>:</p>
<ul>
<li><a href="http://www.php.net">PHP</a> 5.2 or greater.</li>
<li>The <a href="http://code.iamcal.com/php/flickr/readme.htm">PEAR Flickr API</a></li>
</ul>
<p>This script is available under the <a href="http://www.gnu.org/copyleft/gpl.html">GNU General Public License (GPL)</a> version 3 or later. (<a href="https://github.com/adamfranco/PhotoSetToKML" title="View the sourcecode on Github.">Source Code</a>)</p>
<p><strong>Updates:</strong></p>
<ul>
<li><strong>2011-10-06</strong>
<ul>
<li><a href="https://github.com/adamfranco/PhotoSetToKML" title="View the sourcecode on Github.">Moved the source-code to Github</a></li>
</ul>
</li>
<li><strong>2007-08-27</strong>
<ul>
<li>Now uses htmlspecialchars() to clean titles instead of htmlentities(), which was causing excessive translation of German characters. Thanks to <a href='http://www.ogleearth.com/'>Stefan Geens</a> for pointing this out.</li>
<li>Form now generates valid XHTML 1.0 Strict.</li>
<li>Can now use image thumbnails instead of camera icons. Thanks for the idea, Nicolas Hoizey.</li>
</ul>
</li>
<li><strong>2007-08-24</strong>
<ul>
<li>Now escapes ampersands in titles and descriptions. Thanks, Jesse, for pointing this out.</li>
		</ul>
</li>
</ul>
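<p>The escaping changes in the updates above come down to the fact that KML is XML, and XML predefines only five named entities (<code>amp</code>, <code>lt</code>, <code>gt</code>, <code>quot</code>, <code>apos</code>). The Python sketch below is an illustration rather than the script's own code. The <code>html_entities</code> function is a hypothetical, rough stand-in for PHP's htmlentities(), and <code>parses_as_xml</code> is a helper made up for the demonstration. It shows why an HTML-only entity such as <code>ouml</code> is a problem in a KML title while a plainly escaped ampersand is not.</p>

```python
import xml.etree.ElementTree as ET
from html.entities import codepoint2name
from xml.sax.saxutils import escape


def html_entities(text):
    """Rough stand-in for htmlentities(): use a named HTML entity wherever one exists."""
    return "".join(
        "&%s;" % codepoint2name[ord(ch)] if ord(ch) in codepoint2name else ch
        for ch in text
    )


def parses_as_xml(fragment):
    """Report whether the fragment is accepted as the content of an XML element."""
    try:
        ET.fromstring("<name>%s</name>" % fragment)
        return True
    except ET.ParseError:
        return False


title = "Köln & Düsseldorf"

minimal = escape(title)        # escapes only the ampersand and angle brackets
named = html_entities(title)   # also turns the umlauts into HTML named entities

print(minimal, parses_as_xml(minimal))
print(named, parses_as_xml(named))
```

<p>With the minimal escaping the umlauts stay as ordinary characters, so the document only needs to declare a suitable encoding such as UTF-8.</p>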
<p><strong>Future Improvement Ideas:</strong></p>
<ul>
<li>Add an option for icon size.</li>
<li>Add options for custom icon/path styles. I&#8217;m not sure whether to give several options, or just provide a field for a block of arbitrary KML style-markup.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.adamfranco.com/2007/08/23/flickr-photo-set-to-kml/feed/</wfw:commentRss>
		<slash:comments>45</slash:comments>
		</item>
	</channel>
</rss>

