The PGP is not a traditional research study

Starting in 2005 as a pilot experiment with 10 individuals, the Harvard Personal Genome Project (Harvard PGP) pioneered a new form of genomics research. The main goal of the project is to allow scientists to connect human genetic information (human DNA sequence, gene expression, associated microbial sequence data, etc.) with human trait information (medical information, biospecimens and physical traits) and environmental exposures.

Project participants consent to provide their own biological samples for whole genome sequencing and to the use of these materials for biological research. The project now has over 5,000 participants.


Sharing data is critical to scientific progress, but has been hampered by traditional research practices—our approach is to invite willing participants to publicly share their personal data for the greater good. The project is dedicated to creating public genome, health, and trait data.

In traditional research projects, participants give samples and trait information to researchers. Usually, data in such studies is closely guarded until publication, and rarely shared in a form that would allow fellow scientists to reproduce their findings or use the same material or data for studies that may even be unrelated to the original study goals. In addition, most studies do not seek consent to share individual genetic data publicly. However, for other researchers to make use of that same data, the connection between an individual’s trait information or health care records and their genetic data is critical to scientific understanding, advancement and reproducibility.

In sharp contrast, PGP participants consent to publicly share their genomic and trait data in a free and open manner to be used for unimpeded research and other scientific, patient care and commercial purposes worldwide. Consistent with this consent, the project organizers seek to lower as many barriers as possible to access PGP data and cells to empower and engage the scientific community to drive new knowledge about human biology.

Initiated by George Church at Harvard Medical School in 2005, the Personal Genome Project has pioneered ethical, legal, and technical aspects related to the creation of public resources involving highly identifiable data like human genomes.

Public data, methods, and materials

We believe sharing is good for science and society. Our project is dedicated to creating public resources that everyone can access. Privacy, confidentiality and anonymity are impossible to guarantee in a context like this research study where public sharing of genetic data is an explicit goal. Therefore, our project collaborates with participants who are fully aware of the implications and privacy concerns of making their data public. Volunteering is not for everyone, but the participants who join make a valuable and lasting contribution to science.

Ongoing participatory research

We respect the people behind the data, and we aim to maintain strong relationships with participants. We want to collaborate on tracking health and other traits as they unfold over time. We also want to better understand the benefits and risks related to accessing and sharing personal genomes and other types of data.

Genomes, environments, and traits

The genome is just a part of the story: genes interact with the environment to form traits. Participants may choose to contribute other public data to build public records of their health and traits. We also try to connect participants with research, education, and citizen science projects related to personal genome data.

Genomic data and identifiability

It is important to understand that the specific order of the 6 billion As, Cs, Gs and Ts that represent a human genome (3 billion from each parent) is unique to an individual versus all other people on the planet. One possible exception is an identical twin, who might share an almost identical genome sequence. Long before the whole human genome was sequenced, law enforcement and forensic scientists were able to identify individuals from just small parts of the human genome that vary widely from individual to individual. Therefore, genomic data that is published can be highly identifiable. Participants in the PGP realize the value of data to science and make personal decisions about how much data to share in their public profiles.

Additional project goals

Part of the PGP is a social experiment in open data sharing of personally identifying information, and we are interested in:

  • Exploring the opportunities, impacts and risks of public genomics research;
  • Developing a public dataset of information to aid in the development of analytical tools for scientists, clinicians and individuals; and
  • Educating the public about the potential benefits, risks, and uncertainties posed by the widespread availability of genetic and related information.

The PGP also seeks to develop a model system for experts on health care, molecular biology, genetic counseling, public health, law, education, and research to come together and collaborate. We hope that the PGP’s datasets will help to extend such discussions to the creation of case studies and to find out what individuals, clinicians, and researchers might want or not want in such datasets, and why.


More information on participation is available here, but briefly, the process involves:

  • Reading and agreeing to the Consent form
  • Scoring perfectly on a comprehensive exam to make sure that you understand the study and potential risks
  • Once enrolled, you will be assigned a unique identifier and may participate in a number of optional activities. Generally speaking, the more activities you participate in, the more valuable your data is, because it gives researchers multiple ways to compare your traits with each other and with your genetic data. Depending upon current project funding, as a participant, you may participate in:
      • Whole genome DNA sequencing (funding permitting) – also requires a blood or saliva sample
      • Blood donation for cell line creation that will be sharable with researchers worldwide
      • Dozens of trait and disease surveys
  • PGP participants may also engage in a number of open science projects, including:
      • Participating in additional Third Party research activities that return results to participants, so that participants can include them in their public PGP Identifier. Some of these studies are launched as intramural studies by PGP staff members or Church Lab scientists, and others are extramural activities by independent research groups.
      • Attending the annual Genomes, Environments and Traits (GET) conference.

Funding Sources

Sources of funding for the Harvard PGP have varied over time. Current sources of funding are listed here.

The chart below shows the project’s growth since 2011.

PGP enrollment over time

Part of a global PGP network

The Harvard PGP is a member of the Global Network of Personal Genome Projects (PGP), a group of research studies creating freely available scientific resources that bring together genomic, environmental and human trait data donated by volunteers.