Data Analysis

This page provides access to some of the data and analysis results for the Pilot project. Please note that some of the data and results are preliminary at this stage.

  1. Raw sequence received from Washington University.
  2. Renaming of WashU IDs to project IDs.
  3. Data cleaning.
  4. Metadata (pilot_metadata.txt).
  5. RDP output (rdp_output.tar.gz).
  6. Genera count.
  7. Top genera.
  8. Inter-subject variability (genera).
  9. Shared genera among individual samples.
  10. Chao and Shannon comparison at genus level.
  11. Phyla count.
  12. Top phyla.
  13. Inter-subject variability (phyla).
  14. Shared phyla among individual samples.
  15. Select fasta sequences.
    Along with 6 Ancient Archaeal Group (AAG) sequences (AB525476, DQ522914, DQ522938, DQ640139, DQ640138, DQ640134).
  16. Multiple Sequence Alignment.
  17. Neighbor output file.
  18. Tree file.
  19. Environment file.
  20. Category file.
  21. Unifrac images.
  22. Click to enlarge the images below.


    Unifrac images in TIF format can be downloaded here
  23. Unifrac Distance.
  24. Unifrac cluster samples.
  25. Unifrac Jackknife cluster.
  26. HeatMap (genus, phylum)
  27. Click to enlarge the images below.


    The genus HeatMap in TIF format can be downloaded here
  28. Significantly enriched genera.
  29. Significantly enriched phyla.
  30. Correlated genera.
  31. Correlated phyla.
  32. Selected sequences for Ureaplasma, Mycoplasma and Neisseria and their BLAST outputs against RDP
  33. BLAST outputs for unclassified sequences

  34. BLAST output of filtered sequences
  35. Parsed BLAST output (TopRDPblastHit.xls)
  36. SeqMatch results

  37. OTU analysis.
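
For readers who want to recompute the diversity figures behind item 10 from the genera count file, the two indices can be sketched as below. This is a minimal illustration, not the project's own pipeline: it assumes that "Chao" refers to the bias-corrected Chao1 richness estimator and that Shannon uses the natural logarithm. The function names and the per-sample counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over genera with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 is the number of genera seen exactly once (singletons) and
    F2 the number seen exactly twice (doubletons).
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical genus counts for one sample; the numbers are illustrative only
# and are not taken from the project data.
sample_counts = {
    "Ureaplasma": 10,
    "Mycoplasma": 5,
    "Neisseria": 1,
    "GenusA": 1,
    "GenusB": 2,
}

counts = list(sample_counts.values())
print("Shannon:", shannon_index(counts))
print("Chao1:", chao1(counts))
```

Running the two functions on each sample's column of the genera count table would give one Shannon and one Chao1 value per sample, which can then be compared across subjects.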