Scott's Research Ramblings: July 2014

Sunday, July 20, 2014

Haemophilus sRNA?

I was looking at the old RNA seq data, specifically at regions between genes, and I found an enriched region between the gene for CRP and the gene upstream of it. It looks like a potential small RNA. Here:

It seems to be differentially expressed over time (note: the bottom graph is log scale), which is why I chose to look closely at this region in particular. The coverage is strikingly consistent at its peak. I put the sequence into Rfam and got a single result for the 3' end: C4 antisense RNA. It appears to target the ant (antirepressor) gene and originates from a phage. As far as I know, Haemophilus influenzae does not have an ant gene. The sequence does appear to mostly follow the structural template for this RNA, as seen below:

I hope to find other possible sRNAs using a similar technique. Stay tuned.
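For anyone curious what "a similar technique" could look like in code, here is a minimal sketch of scanning the gaps between annotated genes for unusually high coverage. It is not the procedure I actually used (I found this region by eye). The function names, the 0-based half-open coordinates, and the `min_mean` and `min_len` thresholds are all assumptions made up for illustration.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # assumed 0-based, half-open (start, end)


def intergenic_regions(genes: List[Interval], genome_length: int) -> List[Interval]:
    """Return the gaps between annotated genes, merging overlapping genes."""
    gaps = []
    prev_end = 0
    for start, end in sorted(genes):
        if start > prev_end:
            gaps.append((prev_end, start))
        prev_end = max(prev_end, end)
    if prev_end < genome_length:
        gaps.append((prev_end, genome_length))
    return gaps


def enriched_gaps(
    coverage: Sequence[float],
    genes: List[Interval],
    min_mean: float = 10.0,  # hypothetical threshold, would need tuning
    min_len: int = 50,       # hypothetical minimum gap size
) -> List[Tuple[int, int, float]]:
    """Report intergenic gaps whose mean per-base coverage is at least min_mean."""
    hits = []
    for start, end in intergenic_regions(genes, len(coverage)):
        length = end - start
        if length < min_len:
            continue
        mean_cov = sum(coverage[start:end]) / length
        if mean_cov >= min_mean:
            hits.append((start, end, mean_cov))
    return hits
```

Averaging over a whole gap will dilute a short peak inside a long intergenic region, so a sliding window would probably be a better choice for real data.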

RNA seq Analysis

I took a look at some RNA seq data previously acquired by a post-doc in our lab, and I've been doing various things just to practice analyzing large datasets. Here are some interesting things I've done:

(note: K = KW20, U = HI_0659-, T = HI_0660-, S = sxy-)

Coverage of sxy mRNA

Nucleotide-resolution coverage of the sxy mRNA reads. There appears to be less expression in the HI_0659 knockout and possibly more in the HI_0660 knockout. The two KW20 replicates appear to differ, but I think it is a timing issue, since the 10 min and 30 min lines in the B replicate appear to be averaged versions of the 10 min and 30 min lines in the A replicate.

Number of fragments that start/end at a particular base (KW20)

The common drop around base #255 (especially at 30 min) interested me for quite a while as a possible cleavage site, but it could also be due to specific secondary structural elements that encourage strand breakage during fragmentation. I plan on creating graphs like these for the rest of the genome, so I'll see if this is a common thing.
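The counting behind a graph like this is simple enough to sketch. The version below assumes each fragment is already available as a (start, end) pair in 1-based, inclusive coordinates relative to the transcript; the function name and input format are hypothetical, not what my actual script uses.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def fragment_end_counts(
    fragments: Sequence[Tuple[int, int]], length: int
) -> Tuple[List[int], List[int]]:
    """Count how many fragments start and how many end at each base.

    fragments: (start, end) pairs, assumed 1-based and inclusive.
    length: number of bases in the region being plotted.
    Returns two lists of length `length`; index 0 corresponds to base 1.
    """
    starts = Counter(start for start, _ in fragments)
    ends = Counter(end for _, end in fragments)
    start_counts = [starts.get(pos, 0) for pos in range(1, length + 1)]
    end_counts = [ends.get(pos, 0) for pos in range(1, length + 1)]
    return start_counts, end_counts
```

A position where many fragments end and many others start right next to it is what a cleavage site (or a fragmentation hot spot) would look like in these counts.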

Timecourse for all genes

I wrote an R script that creates a timecourse-type graph for all the genes under every condition. Here's what they look like:

The competence genes actually stand out because the strains really differ at the 30 min mark:

I also tried normalizing the score a couple of different ways. First, I normalized by the sum of the fragment counts over the whole genome for each condition. Second, I normalized by the value at the 75th percentile (upper quartile) for each condition. Here's a comparison (chosen at random):

1. Normalized by sum

2. Normalized by quartile

I think I prefer the second method.
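Here is a minimal Python sketch of the two normalizations, assuming the counts are stored as a dictionary mapping each condition to a list of per-gene fragment counts. My actual script is in R; the function names and data layout here are hypothetical, and the choice of linear interpolation for the quartile is also an assumption.

```python
from statistics import quantiles
from typing import Dict, List

Counts = Dict[str, List[float]]  # condition -> per-gene fragment counts


def normalize_by_sum(counts: Counts) -> Counts:
    """Divide each gene's count by the total count for that condition."""
    normalized = {}
    for condition, values in counts.items():
        total = sum(values)
        if total == 0:
            raise ValueError(f"no fragments counted for condition {condition!r}")
        normalized[condition] = [v / total for v in values]
    return normalized


def upper_quartile(values: List[float]) -> float:
    """75th percentile, interpolating linearly between observed values."""
    return quantiles(values, n=4, method="inclusive")[2]


def normalize_by_upper_quartile(counts: Counts) -> Counts:
    """Divide each gene's count by that condition's 75th-percentile count."""
    normalized = {}
    for condition, values in counts.items():
        q75 = upper_quartile(values)
        if q75 == 0:
            raise ValueError(f"upper quartile is zero for condition {condition!r}")
        normalized[condition] = [v / q75 for v in values]
    return normalized
```

One reason the quartile version may behave better is that the total count can be dominated by a handful of very highly expressed genes, while the 75th percentile is much less sensitive to them.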

Finally, one interesting thing I noticed was that the trp operon appeared to have similar characteristics to the competence proteins:

One trpR regulated operon

Another trpR regulated operon

trpR

I'm not sure whether there is a known connection between the trp operon and competence regulation. According to these graphs (and others), there appears to be one. Perhaps it's something to pursue.

Friday, July 11, 2014

Coming up next...

I just want to clarify where I'm currently at and what paths I can take from here.

So I've completed my big transformation experiment as described in previous posts. Here's the resulting data:

The first two graphs show the difference in transformation frequency between strains and under different conditions. The second pair of graphs shows the ratio between transformation frequency in hfq+ and hfq- strains.

A few things stand out and I would like to pursue them.

1) The sxy-1 hfq- double mutant apparently has a higher transformation frequency than the sxy-1 single mutant. This is striking. I would like to see whether it holds for other sxy point mutations, so I'm in the process of making a sxy-2 (hypercompetent) hfq- mutant. I would also like to construct a sxy-7 (decreased competence) hfq- mutant and see the effects of strengthening the mRNA structure, because I'm starting to think there may be some regulation happening at the RNA level. The problem with sxy-7 is that it decreases the natural transformability of the cell, so it's much harder to get the hfq knockout into its genome. I'll see if I can find some way.

2) For some reason, in the simple hfq- strain, there appears to be a consistent transformation frequency (TF) across all conditions except MIV and log + cAMP. Something must be going on to cause the TF to fall after adding cAMP as the culture gets dense. I would like to do a time-course experiment for the hfq- mutant and perhaps try adding the cAMP at different times to see what happens.

3) I would also like to find out where on the competence pathway the effect of hfq is happening. Back in the 1990s, someone in the lab created a strain that expresses beta-galactosidase in the presence of the CRP-cAMP complex. I would like to see how the expression of this reporter gene changes in the hfq knockout versus an hfq+ strain. I am currently growing the hfq+ version of this strain and I plan on transforming hfq- DNA into it.

That'll give me something to do.

Friday, July 4, 2014

Why Fructose?

Today I learned that only the fructose phosphotransferase system (PTS) is present in the Rd strain of Haemophilus influenzae. This means that H. influenzae cells rely on fructose as their primary source of sugar. This is odd considering glucose is much more abundant in respiratory mucus, where we would expect to find H. influenzae (meaning it would make more sense if the cells relied on glucose rather than fructose). When the cell's preferred sugar is not available, the cell turns on genes that allow use of other sugars (and the cell also turns on its competence genes and becomes ready to take up DNA).

I was just thinking about this and I hypothesized that perhaps the cell relies on fructose because it wants to turn on its competence genes often - relying on a scarce sugar would keep the cell's competence machinery in production more of the time. I also thought that perhaps it's actually more effective to use DNA as a source of energy than glucose, so I took to the internet and tried counting the expected ATP output for glucose vs. nucleotides:

Glucose
+2 net ATP from glycolysis
+2 ATP from citric acid cycle
+32 from electron transport chain (let's say)
36 net ATP production from glucose breakdown

Transporting glucose with a PTS costs the equivalent of 2 ATP. However, one of these reduces the amount of ATP needed in glycolysis by 1, so the net cost of transporting glucose into the cell is 1 ATP.

Nucleotides
I'm going to assume the cost of breaking dsDNA down into deoxyribose and nitrogenous bases is 0, which may be entirely wrong.

Deoxyribose
-1 ATP to split into glyceraldehyde-3-phosphate and acetaldehyde
+18 ATP since glyceraldehyde-3-phosphate undergoes half of glucose catabolism
+1 ATP to balance the fact that an early energy-costing step of glycolysis was skipped
+14 ATP produced from the acetyl-CoA and NADH produced from acetaldehyde
32 net ATP production from deoxyribose breakdown

Nitrogenous base
De novo synthesis of a base costs 4 ATP, so I will say that the base is worth +4 ATP, since rapidly dividing cells are always in need of bases.

That makes 36 ATP per nucleotide (32 from deoxyribose plus 4 for the base), which means the two break even in terms of energy yield. The real question is how much energy it takes to take up DNA, per base. I don't know if the answer is known. If it's less than 1 ATP (since entire strands are taken up), then it's possible that using DNA would in fact be a more effective way to produce energy than sugar.
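To make the accounting easy to check or tweak, here it is as a small Python tally. Every number is one of the rough estimates from this post; the DNA uptake cost per base is unknown, so it is left as a parameter. The variable and function names are just ones I made up for this sketch.

```python
# Rough ATP estimates taken from the lists in this post (not measured values).
GLUCOSE_STEPS = {
    "glycolysis (net)": 2,
    "citric acid cycle": 2,
    "electron transport chain": 32,
}
DEOXYRIBOSE_STEPS = {
    "split into glyceraldehyde-3-phosphate and acetaldehyde": -1,
    "glyceraldehyde-3-phosphate through half of glucose catabolism": 18,
    "skipped early energy-costing step of glycolysis": 1,
    "acetyl-CoA and NADH from acetaldehyde": 14,
}
BASE_VALUE = 4          # assumed worth of a salvaged base (de novo synthesis cost)
GLUCOSE_PTS_COST = 1    # net cost of PTS transport, as estimated in this post

glucose_gross = sum(GLUCOSE_STEPS.values())
deoxyribose_gross = sum(DEOXYRIBOSE_STEPS.values())
nucleotide_gross = deoxyribose_gross + BASE_VALUE


def net_yield(gross: float, uptake_cost: float) -> float:
    """Gross ATP yield minus the cost of getting the molecule into the cell."""
    return gross - uptake_cost


def breakeven_uptake_cost() -> float:
    """DNA uptake cost per base at which a nucleotide nets the same as glucose."""
    return nucleotide_gross - net_yield(glucose_gross, GLUCOSE_PTS_COST)


def dna_beats_glucose(uptake_cost_per_base: float) -> bool:
    """True if a nucleotide nets more ATP than glucose under these estimates."""
    return net_yield(nucleotide_gross, uptake_cost_per_base) > net_yield(
        glucose_gross, GLUCOSE_PTS_COST
    )
```

With these estimates the break-even point is an uptake cost of 1 ATP per base, so something like `dna_beats_glucose(0.5)` comes out true and `dna_beats_glucose(2)` comes out false. Changing any of the estimates above shifts that threshold.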

I don't think that the cells would rely on DNA over sugar for energy, but it was fun to do the accounting. I know there's a lot of DNA in mucus and that any sugar supply would be competed for by the host cells, so there is some reasoning behind this idea, but again, I'm not really convinced.

Also, to get these ATP values, I had to look at biochemical pathways and just count the reactions that required ATP. There may be errors (e.g., I was using mostly human pathways, since more information is available for them), so if you see any, let me know with a comment.