PHYS 291 project - analysis of fluorescence lifetime data
Press here for the project log.
Project description (original)
I will collect data from my own fluorescence lifetime experiments. The data sets are in the form of a 128x128 matrix containing the photon count, two lifetime components and amplitude values for each pixel. First I will plot the lifetime distributions (photon count vs. lifetime), and then I will try to make the distribution more accurate by eliminating background noise from the data set. I will calculate mean lifetimes and standard deviations from the data set. I will also compare different samples to determine how many are needed to have a statistically significant number of data sets. I will then image the lifetimes with a color gradient indicating different lifetime values, and maybe also use the amplitude weightings to image the locations within the sample where the lifetime components differ in contribution. The data input and calculations will be done using C++ and the visualization will be done with ROOT.
Project description (modified)
The project description (original), which I wrote before actually starting the project, had to be slightly modified. An introduction to the experiment, with background information, follows in the introduction part. Results are presented and discussed in the results part. The program does the following:
Making a 1D histogram showing the chlorophyll fluorescence lifetime distribution of an algae sample using experimental data.
Fitting the data with a Gaussian and a Landau distribution, and checking how well they fit the real data by calculating the chi-square.
Eliminating annoying outliers from the data set(s) and plotting two lifetime components in the same histogram.
Making 2D histograms showing the algae cells as contour plots color coded for the lifetime components.
The source code and the data files are available on the code page. The code files are commented, and further descriptions are provided under results below.
My PhD work is about measuring fluorescence lifetimes, and a major part of it is measuring chlorophyll in algae. The fluorescence lifetime is the average time a molecule spends in an excited state before returning to the ground state by emitting light. This light is called fluorescence, and for most biological molecules the lifetime ranges from a few picoseconds to a few nanoseconds. In my experiments, fluorescence is induced by guiding a laser beam into a microscope, and the microscope software allows a scan of a chosen sample area. The lifetimes are measured using a technique called Time-Correlated Single Photon Counting (TCSPC). The laser sends out pulses in the femtosecond range, and the TCSPC module correlates the pulses with the emitted fluorescence photons, so that a histogram of photon arrival times is built up. This histogram represents the fluorescence decay curve from which the fluorescence lifetime is calculated. The shape of the curve also reveals whether a pixel in the scanned area contains more than one lifetime component.
The fluorescence is induced using two-photon excitation, where the laser is tuned to twice the absorption wavelength. This significantly reduces damage to live specimens, since the photon density is only high enough for excitation at the focal point. It is also more accurate for the same reason, since unwanted fluorescence from other parts of the sample is eliminated.
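As a compact way of writing down what "more than one lifetime component" means for the decay curve, the two-component case can be sketched as a sum of two exponentials. This assumes the usual biexponential model; the symbols a_1, a_2 (amplitudes) and tau_1, tau_2 (lifetimes) follow the naming used elsewhere on this page, and the amplitude-weighted mean tau_m is added here only for illustration. In practice the measured curve is also smeared by the instrument response, which is left out of this sketch.

```latex
I(t) = a_1 \, e^{-t/\tau_1} + a_2 \, e^{-t/\tau_2},
\qquad
\tau_m = \frac{a_1 \tau_1 + a_2 \tau_2}{a_1 + a_2}
```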
The main reason fluorescence lifetimes are interesting is that they yield more information than, for example, fluorescence intensity. If one measures the lifetime of chlorophyll in a fresh alga, and then in the same alga after it has been exposed to some kind of stress factor (e.g. UV radiation), a change in lifetime indicates changes on a molecular level. One factor that determines the fluorescence lifetime is called quenching. To make this easier to understand, I propose a small thought experiment: Imagine that you have a number of excited molecules in a "room". The lifetime is the average time the molecules spend in the excited state. If the only way for them to de-excite is through fluorescence, this can be represented as the "fluorescence door": the only available door for the molecules to leave through. If another door is suddenly opened, for example a "photochemistry door", then there are two ways of de-excitation and the room can be emptied faster. Therefore, a shorter lifetime indicates stronger quenching, and the task is to find out what exactly causes the quenching.
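The "doors" in the thought experiment can be written as rate constants. This is only a sketch, assuming simple first-order kinetics; the symbols k_f (rate of leaving through the fluorescence door) and k_p (rate of leaving through the photochemistry door) are introduced here for illustration and do not appear in my data or code.

```latex
\tau_0 = \frac{1}{k_f}
\quad \text{(fluorescence door only)},
\qquad
\tau = \frac{1}{k_f + k_p} < \tau_0
\quad \text{(second door open, } k_p > 0 \text{)}
```

Every additional open door adds a positive term to the denominator, which is why stronger quenching always shows up as a shorter lifetime.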
Each lifetime measurement in my experiments produces several data sets. Since the measurement is performed as a scan of 128x128 pixels, every data set is in the form of a 128x128 matrix. The data I have used in this project include 1) the lifetime (two components) and 2) the photon count (intensity).
For more information on my work with fluorescence lifetimes, see the paper here.
To make comparisons easier, I include here a table from the above-mentioned paper. The table shows chlorophyll lifetimes for the alga H.pluvialis in two different situations: 1) green stage and 2) red stage. There are great differences between the alga in the two stages; however, as shown in the table, the main lifetime component for chlorophyll is not so different. The table also shows the stages for "normal" cells and "DCMU" cells. DCMU is a well-known photosynthesis inhibitor: it hinders electron transfer from photosystem II to photosystem I, so photochemical reactions are not possible when DCMU has been added to the cells. As a consequence, the fluorescence lifetime should be longer, since quenching of fluorescence has been reduced. This is indeed the case for the green-stage cells, as shown in Table 1. For the red-stage cells, the lifetime is not longer, so something else must be compensating for the lack of photochemistry.
TABLE 1. Chlorophyll fluorescence lifetimes for green and red stage H.pluvialis, for both normal and DCMU-inhibited cells. a1 and a2 are amplitudes describing the average contribution of the two lifetime components.
1. Make a 1D histogram
The first thing I did was to make a 1D histogram of a single lifetime component. This sounds simple enough, but there were many problems getting there. The first major problem was to read the data from the 128x128 matrix (containing one lifetime value per element) and fill these values into a histogram. This problem was solved with help from Boris Wagner. A 1D histogram of the distribution of lifetime component 1 in chlorophyll a in the green algae H.pluvialis is shown in Figure 1. Note that this is the lifetime calculated from only one algae sample. The distribution has also only been cut graphically, so that outliers are not shown in the histogram. The problem of outliers has been dealt with in other ways for the rest of the project (see below). The source code for making this histogram is in the file distribution1.C, available on the program code page with comments in the code. The data set used is called 1_t1.asc, available in the same place.
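The reading and filling step described above can be sketched in plain C++ as follows. This is not the code from distribution1.C, which uses ROOT histograms; it is a minimal stand-alone illustration that assumes the .asc file is a plain text file of whitespace-separated numbers. The function names readMatrix and fillHistogram are made up for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this out
#include <vector>

// Read whitespace-separated numbers from a text stream into a flat vector.
// For a 128x128 scan this gives 16384 values in row-major order.
std::vector<double> readMatrix(std::istream& in) {
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Count values into nBins equal-width bins covering [lo, hi).
// Values outside the range are skipped, which mirrors a purely
// graphical cut: they are hidden, not removed from the data.
std::vector<int> fillHistogram(const std::vector<double>& values,
                               int nBins, double lo, double hi) {
    assert(nBins > 0 && hi > lo);
    std::vector<int> counts(nBins, 0);
    const double width = (hi - lo) / nBins;
    for (double v : values) {
        if (v < lo || v >= hi) continue;
        int bin = static_cast<int>((v - lo) / width);
        if (bin >= nBins) bin = nBins - 1;  // guard against rounding at the edge
        ++counts[bin];
    }
    return counts;
}
```

readMatrix can be given a std::ifstream opened on a data file, or a std::istringstream for a quick test. With ROOT, the second function is replaced by calling Fill() on a 1D histogram for each value.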
FIGURE 1. The lifetime distribution of lifetime component 1 in chlorophyll a in the green algae H.pluvialis.
As seen in Figure 1, the mean lifetime is 389 ps. Comparing this with the lifetimes in Table 1, one would think that the data set 1_t1.asc is lifetime component 1 from a measurement of green-stage DCMU-inhibited cells, and that would be a correct assumption.
2. Fitting the data
Figure 2 shows the same histogram as Figure 1, with a Gaussian fit. The number of bins has also been increased from 100 to 200.
FIGURE 2. The same lifetime distribution as shown in Figure 1, fitted with a Gaussian. Clearly, the fit is not optimal.
The chi-square / number of degrees of freedom (ndf) is in this case 34, and for a good fit it should be close to 1. An improved fit is shown in Figure 3. This is the Landau fit, and the chi-square / ndf value is 21 for the same parameter settings. It is easy to see from the histograms that neither of these is a good fit for the data set. The Landau fit would improve significantly if the data below 200 ps were cut. A user-defined function would be required to fit this particular data set properly. The point of including these fits was not, however, to find the best fit for this particular data set, but rather to show a few examples of how it is possible to fit a data set in ROOT. Note that the probability entry in the stats box is the chi-square probability of the fit, which is effectively zero for fits this poor. The entry is therefore unnecessary here, but I was unable to find a way to leave it out and still keep the chi-square / ndf entry (see comment in source code).
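To show what the chi-square / ndf number measures, here is a minimal stand-alone sketch of the calculation. It is not taken from distribution1fit.C, where ROOT does this internally. The sketch assumes Poisson errors on the bin contents (variance equal to the bin content) and skips empty bins, which I believe is close to what ROOT does by default, but treat that as an assumption. The function names gaussian and reducedChiSquare are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gaussian evaluated at x: amplitude * exp(-0.5 * ((x - mean) / sigma)^2).
double gaussian(double x, double amplitude, double mean, double sigma) {
    const double z = (x - mean) / sigma;
    return amplitude * std::exp(-0.5 * z * z);
}

// Chi-square per degree of freedom between the bin contents (observed)
// and the fit function evaluated at the bin centres (expected).
// Assumes Poisson errors, i.e. variance = observed bin content,
// and skips empty bins. nParams is the number of fitted parameters.
double reducedChiSquare(const std::vector<double>& observed,
                        const std::vector<double>& expected,
                        int nParams) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    int usedBins = 0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        if (observed[i] <= 0.0) continue;  // empty bin: no error estimate
        const double diff = observed[i] - expected[i];
        chi2 += diff * diff / observed[i];
        ++usedBins;
    }
    const int ndf = usedBins - nParams;
    assert(ndf > 0);
    return chi2 / ndf;
}
```

A value near 1 means the deviations between data and fit are about as large as the statistical errors predict. Values like 34 or 21 mean the fit function has the wrong shape for the data.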
FIGURE 3. The same distribution as shown in Figure 1, fitted with a Landau. The fit is better than the Gaussian, but far from good.
The source code for making both Figures 2 and 3 is called distribution1fit.C and is available on the code page.
3. Outliers and plotting two lifetime components
All experimental physicists know (and despise) outliers: data values that are extreme in some way and should be eliminated before analysing the data. Their origin can be background noise, system feedback or reflections, amongst others, and there are a number of ways to deal with them. In the first section I simply cut out graphically the lifetime values that were very low or very high. However, any numerical results calculated from the data set will of course still include the outliers, and this should be avoided. For this section I decided to get rid of outliers in another way, by taking advantage of the way my data sets are produced in the experiment. A 128x128 image is produced, where each pixel contains the lifetime value, photon count and so on. The values from each pixel correspond, meaning that pixel (1,1), or matrix element (1,1), in e.g. the lifetime data set corresponds to element (1,1) in the photon count data set.
The idea is that pixels with a very low number of photons are probably background noise, and pixels with a very high number of photons are most likely some kind of saturation. In both cases the lifetime is not reliable and should be removed. I therefore decided to first read the photon count data set and manually set an upper and a lower limit for the number of photons that is "good", and then make a new 128x128 matrix where all elements with a value outside these bounds are set to zero and all other elements are set to one. This new matrix is then multiplied element by element with the lifetime matrix, thereby setting the lifetime value of all the "bad" pixels to zero, while the rest keep their original value.
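The masking procedure described above can be sketched as follows. This is a simplified stand-alone illustration, not the code from lifetimes.C. The matrices are stored as flat vectors, the photon limits are passed in as parameters, and the function names makeMask, applyMask and meanOfKept are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a 0/1 mask from the photon counts: 1 where the count lies
// within [minPhotons, maxPhotons], 0 otherwise.
std::vector<int> makeMask(const std::vector<double>& photons,
                          double minPhotons, double maxPhotons) {
    assert(minPhotons <= maxPhotons);
    std::vector<int> mask(photons.size(), 0);
    for (std::size_t i = 0; i < photons.size(); ++i) {
        if (photons[i] >= minPhotons && photons[i] <= maxPhotons) mask[i] = 1;
    }
    return mask;
}

// Element-by-element product of the lifetime matrix and the mask:
// "bad" pixels become 0, all others keep their lifetime value.
std::vector<double> applyMask(const std::vector<double>& lifetimes,
                              const std::vector<int>& mask) {
    assert(lifetimes.size() == mask.size());
    std::vector<double> masked(lifetimes.size());
    for (std::size_t i = 0; i < lifetimes.size(); ++i) {
        masked[i] = lifetimes[i] * mask[i];
    }
    return masked;
}

// Mean of the pixels that survived the mask. Zeros are skipped,
// otherwise the removed pixels would pull the mean down.
double meanOfKept(const std::vector<double>& masked) {
    double sum = 0.0;
    int n = 0;
    for (double v : masked) {
        if (v != 0.0) { sum += v; ++n; }
    }
    return n > 0 ? sum / n : 0.0;
}
```

Note that the zeros left behind by the mask must also be skipped when filling the histogram, or they will all pile up in the lowest bin.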
The fluorescent molecules in a living cell are, of course, smaller than the pixels in the image we have produced. We set excitation and emission wavelengths in such a way that we are sure that only chlorophyll a lifetimes are included in the measurement, but in a single pixel there might very well be two or more different "types" of chlorophyll molecules. For most of my measurements on live algae cells, there are two types of chlorophyll molecules, and they can be distinguished by their very different fluorescence lifetimes. Note that they can't be distinguished by fluorescence intensity measurements, which illustrates the usefulness of lifetime measurements. Chlorophyll molecules work together as an antenna array, gathering photons from sunlight and transferring the energy to a reaction centre, where photochemistry starts. Sometimes, however, these reaction centres are closed for various reasons, for example because the array is saturated with energy and photodamage needs to be avoided. In such a case, the fluorescence lifetime is different from that of open reaction centres. So knowing how many lifetime components there are, and the relative amplitudes of the components, gives more information on what is going on in the cells. In Figure 4, both lifetime components tau_1 and tau_2 have been subjected to the mentioned method of eliminating outliers, and plotted in the same histogram.
FIGURE 4. Both lifetime components of chlorophyll in red stage normal H.pluvialis cells.
The results presented in Figure 4 are still from only a single measurement. However, if we compare the lifetime values with the ones in Table 1, we see that they agree well with these data being from red-stage normal cells.
The source code for this part is called lifetimes.C and the data sets used are called tau1.asc, tau2.asc and photons.asc. They are all available on the code page.
4. 2D plot and visualisation
In addition to finding the fluorescence lifetimes and examining how they change in a given situation, it is also useful to investigate where in the cells they occur. In this section I have made contour plots of the same two lifetime components described in section 3, shown in Figure 5 (lifetime tau1) and Figure 6 (lifetime tau2). I have also added an image of the cell generated by the experiment analysis software (SPCImage, Becker & Hickl), which shows a color coded lifetime image of lifetime component 1. A significant disadvantage of SPCImage is that it only allows very limited manipulation of the data used to generate the image. Therefore, it is better to use ROOT for such analysis. The source code for generating these histograms is called visualisation.C and is available on the code page. The data sets used are the same as described in section 3: photons.asc, tau1.asc and tau2.asc.
FIGURE 5. A 2D histogram imaging of the lifetime component tau1.
FIGURE 6. A 2D histogram imaging of the lifetime component tau2.
Looking at the images in Figures 5 and 6, it is clear that it is possible to identify areas within the cell by their lifetime. On closer inspection there also seems to be a correlation between the two lifetime components. As seen in Figure 4, the second component is much more spread out, ranging from 1000 to 2500 ps. This suggests that the chlorophyll molecules responsible for the second lifetime component are more dynamic than the ones responsible for the first lifetime component.
FIGURE 7. An image of the algae cell generated by the SPCImage software, color coded for the lifetime component tau1.
Finally, I have included a lego plot of the tau1 lifetime. It is not very useful, but it looks good.
FIGURE 8. A 2D lego histogram showing the algae cell.
Conclusion and outlook
Before I started taking this course I had never heard of ROOT, and I had done extremely little programming - and no C++ programming at all. After working on this project for the last month, I am finally starting to feel that I can actually DO something with these tools. Although my work is far from sophisticated, at least I have made progress. And although I have already published a paper on the measurements I have used in this project, the vast possibilities of ROOT make this project - and the results I have presented here - very useful for future research. There are so many options for manipulating and visualising the data that I will definitely continue the learning process, so that I can make use of ROOT in my future work.
Made by Arne Skodvin Kristoffersen