Welcome to College Value

Where's This Come From?

All of the data you see here comes from the Department of Education's wonderful College Scorecard dataset. Specifically, it's mashed up from two huge CSV files: Most-Recent-Cohorts-Institution.csv and Most-Recent-Cohorts-Field-of-Study.csv. This data is compiled based on the subset of students who take out federal student loans or grants, so it's not by any means a complete picture. There are also significant gaps in the data where costs or earnings are unknown or privacy suppressed.
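As a rough illustration of what "mashed up" means here, the sketch below joins field-of-study rows to institution rows. It is a minimal example, not the site's actual pipeline: joining on UNITID is an assumption, the inline sample rows are made up, and the column names (INSTNM, CIPDESC, EARN_MDN_1YR) should be checked against the Scorecard data dictionary.

```python
import csv
import io

# Tiny made-up stand-ins for the two Scorecard files (the real ones are far wider).
INSTITUTION_CSV = """UNITID,INSTNM,CITY
100,Example State University,Springfield
200,Sample Technical College,Riverton
"""

FIELD_OF_STUDY_CSV = """UNITID,CIPCODE,CIPDESC,EARN_MDN_1YR
100,5138,Nursing,61000
100,2401,Liberal Arts,PrivacySuppressed
300,5138,Nursing,58000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_programs_to_institutions(institutions, programs):
    """Attach institution columns to each program row (assumed join key: UNITID)."""
    by_unitid = {row["UNITID"]: row for row in institutions}
    joined = []
    for program in programs:
        school = by_unitid.get(program["UNITID"])
        if school is None:
            continue  # program row with no matching institution row
        joined.append({**school, **program})
    return joined


joined = join_programs_to_institutions(
    load_rows(INSTITUTION_CSV), load_rows(FIELD_OF_STUDY_CSV)
)
for row in joined:
    print(row["INSTNM"], "|", row["CIPDESC"], "|", row["EARN_MDN_1YR"])
```

Values are kept as raw strings at this stage, so a suppressed cell still shows up as the literal text PrivacySuppressed rather than a number.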


Data Vintage

The College Scorecard data is updated several times per year. The data currently used on CollegeValue was published in October 2024, which includes:


  • Refreshed institution-level earnings for 6, 8, and 10 years after starting school (updated June 2024)
  • Refreshed field-of-study earnings for 1 and 2 years after graduation, plus new 5-year earnings data
  • Updated Federal Student Aid metrics including operating status, accreditation, and cohort default rates (October 2024)
  • Updated IPEDS-derived metrics for enrollment, admissions, completion rates, and institutional characteristics

It's important to understand that "most recent" doesn't mean current-year. Earnings are measured from de-identified tax records with a significant lag. For field-of-study data, the DOE combines students into 2-year cohorts (e.g., graduates from 2019-20 and 2020-21 pooled together) to increase sample sizes and reduce privacy suppression. The 1-year post-graduation earnings you see for a given major represent earnings from roughly 2-3 years ago. Institution-level earnings at 6-10 years after enrollment look even further back.


Program-level data collection through NSLDS only began in the 2014-15 award year, which limits how far back field-of-study cohorts can go.



To be honest, some of the data is kind of a pain. For instance, multiple colleges share the same name, which is especially common among cosmetology schools. Some are different branches in different areas and some are unrelated. In other cases, the OPEID or UNITID fields are either blank or incorrect (for example, they refer to the main campus rather than the satellite campus that the data is for), or the primary URL for a school is empty or just plain wrong. So there are issues here and there. As we find these, we'll try to get them fixed.
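To give a feel for the same-name problem, here is a minimal sketch of one way to spot name collisions and label the rows so they can be told apart. The sample rows and both helper functions are hypothetical, and using city and state as the tie-breaker is an assumption, not necessarily how the site handles it.

```python
from collections import defaultdict

# Made-up rows; INSTNM, CITY, STABBR, and UNITID mirror Scorecard column names.
schools = [
    {"UNITID": "1", "INSTNM": "Example Beauty Academy", "CITY": "Dayton", "STABBR": "OH"},
    {"UNITID": "2", "INSTNM": "Example Beauty Academy", "CITY": "Tulsa", "STABBR": "OK"},
    {"UNITID": "3", "INSTNM": "Sample Community College", "CITY": "Reno", "STABBR": "NV"},
    {"UNITID": "4", "INSTNM": "example beauty academy ", "CITY": "Mesa", "STABBR": "AZ"},
]


def find_name_collisions(rows):
    """Group rows by normalized name and keep only names used more than once."""
    groups = defaultdict(list)
    for row in rows:
        key = row["INSTNM"].strip().casefold()
        groups[key].append(row)
    return {name: members for name, members in groups.items() if len(members) > 1}


def display_label(row):
    """Build a label that stays unique even when the name itself does not."""
    name = row["INSTNM"].strip()
    return f'{name} ({row["CITY"]}, {row["STABBR"]}) [UNITID {row["UNITID"]}]'


collisions = find_name_collisions(schools)
for name, members in collisions.items():
    print(name)
    for row in members:
        print("  ", display_label(row))
```

This only flags candidates. Deciding whether two same-named rows are branches of one school or unrelated schools still takes a human look.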


The data is based only on federal student loan and grant programs, so it does not cover students who take out private loans, rely on scholarships, or have the good fortune to be able to cover the cost of college themselves.

Graduation Rates

I was surprised to learn that the Dept. of Education measures completion rates at the 8 year mark. So when you look up a specific school on College Scorecard, the graduation rate you see is the 8 year rate. For example, Granite State College shows a rate of 42%, and the 8 year basis is only mentioned in the small-print infobox that appears if you hover over it. There are probably good reasons for this. Granite State is an online college, so its students are far more likely to already have full-time jobs and families. But it seems a bit disingenuous, especially when the 6 year graduation rate is only 14% and the 4 year rate is 3%!

Incomes

Incomes for field rankings are based on median earnings 1 year post-graduation. At the college level, median earnings are available at 6, 8, and 10 year marks after enrollment.

The DOE breaks earnings down by family income tercile: low-income ($30,000 or less), middle-income ($30,001-$75,000), and high-income ($75,001+). The data only includes students who are not currently enrolled (e.g., those in graduate school at the time of measurement are excluded), so the figures may undercount high-earning graduates who continued their education.
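The bracket boundaries above can be written out as a small function. This is only a sketch of the thresholds as described on this page; the function name and the "low" / "middle" / "high" labels are made up for illustration.

```python
def family_income_tercile(family_income):
    """Map a family income in dollars to the bracket described above."""
    if family_income < 0:
        raise ValueError("family income cannot be negative")
    if family_income <= 30_000:
        return "low"      # $30,000 or less
    if family_income <= 75_000:
        return "middle"   # $30,001-$75,000
    return "high"         # $75,001+


for income in (30_000, 30_001, 75_000, 75_001):
    print(income, family_income_tercile(income))
```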

One of the reasons for building this site in the first place is to test the hypothesis that the combination of major and college often matters more than either variable alone. This appears to be the case. For example, expected earnings across nursing degrees vary wildly, especially when accounting for debt loads and graduation rates.

It's important to note that this is still a very limited and possibly skewed view. Some details noted by the DOE:

"There are two notable limitations that researchers should keep in mind for all of these metrics. First, research suggests that the variation across programs within an institution may be even greater than aggregate earnings across institutions. For information related to more recent earnings calculations by field of study, please see the technical documentation for field of study data files. Second, the data include only Title IV-receiving students, so figures may not be representative of institutions with a low proportion of Title IV-eligible students. Additionally, the data are restricted to students who are not enrolled (enrolled means having an in-school deferment status for at least 30 days of the measurement) so students who are currently enrolled in, for example, graduate school at the time of measurement are excluded."

Debt Loads

One key insight is that the amount of debt incurred is independent of completion: students who withdraw are still on the hook for whatever they borrowed! As Bryan Caplan and others have pointed out, most of the value of an undergraduate degree comes from finishing the last year and actually receiving the diploma, rather than accruing evenly over the 4 years.

The DOE says:

"At institutions where large numbers of students withdraw before completion, a lower median debt level could simply reflect the lack of time that a typical student spends at the institution. Therefore, the Department uses the typical debt level for students who complete (GRAD_DEBT_MDN_SUPP or GRAD_DEBT_MDN10YR_SUPP for the debt level expressed in monthly payments) on the consumer website. Additionally, this measure can be placed in context by looking at the borrowing rate of students at the institution (FTFTPCTFLOAN; see above); at institutions where few students borrow, the numbers may represent outliers."

For colleges, we break this down and show median debt for graduates, withdrawals, and both. For individual majors, the debt is based on the loans for only those students who completed the program.

Note that the field-of-study level debt data has significant gaps: many programs have their debt figures privacy suppressed due to small sample sizes. On ranking pages, you can use the "Only show with debt data" checkbox to filter to programs where actual debt figures are available, which gives a more accurate picture of the true cost/benefit tradeoff.
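To make the filtering idea concrete, here is a minimal sketch of what a "debt data only" filter could look like, followed by a simple debt-to-earnings ratio. The dictionary keys, the sample numbers, and the ratio itself are assumptions for illustration. This is not the site's actual ranking formula.

```python
def to_number(value):
    """Return a float, or None when the cell is blank or suppressed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Made-up program rows with hypothetical keys; values are raw strings as read from a CSV.
programs = [
    {"program": "Nursing at School A", "median_debt": "15000", "median_earnings_1yr": "60000"},
    {"program": "Nursing at School B", "median_debt": "PrivacySuppressed", "median_earnings_1yr": "52000"},
    {"program": "Nursing at School C", "median_debt": "27000", "median_earnings_1yr": "45000"},
]


def only_with_debt_data(rows):
    """Keep rows whose debt figure is an actual number."""
    return [row for row in rows if to_number(row["median_debt"]) is not None]


def debt_to_earnings(row):
    """Median debt divided by median 1-year earnings, or None if either is missing."""
    debt = to_number(row["median_debt"])
    earnings = to_number(row["median_earnings_1yr"])
    if debt is None or not earnings:
        return None
    return debt / earnings


# Rank from lowest to highest debt burden, skipping rows with no usable ratio.
rankable = [row for row in only_with_debt_data(programs) if debt_to_earnings(row) is not None]
ranked = sorted(rankable, key=debt_to_earnings)
for row in ranked:
    print(row["program"], round(debt_to_earnings(row), 2))
```

Without the filter, a program with suppressed debt would have to be ranked on earnings alone, which can make it look better or worse than it really is.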

Privacy

This is a sparse matrix. A lot of the data is empty to maintain student privacy. We can still get bigger trends in a lot of cases, but smaller institutions or fields will not have much data.
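In practice, that means every numeric column has to be read defensively. The sketch below shows one way to do it, assuming missing cells appear as the literal text PrivacySuppressed, NULL, or an empty string. The marker set and both function names are illustrative and worth verifying against the actual files.

```python
# Assumed markers for "no usable value" in the raw CSV text.
MISSING_MARKERS = {"PrivacySuppressed", "NULL", ""}


def parse_value(raw):
    """Convert a raw CSV cell to a float, or None when it is missing or suppressed."""
    if raw is None or raw.strip() in MISSING_MARKERS:
        return None
    return float(raw)


def coverage(raw_values):
    """Fraction of cells in a column that hold an actual number."""
    parsed = [parse_value(value) for value in raw_values]
    if not parsed:
        return 0.0
    known = [value for value in parsed if value is not None]
    return len(known) / len(parsed)


earnings_column = ["52000", "PrivacySuppressed", "NULL", "41000"]
print(coverage(earnings_column))
```

Checking coverage per column is a quick way to see which metrics are too sparse to rank on for small schools or fields.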

From the DOE:

"..Those data that do not meet reporting standards are shown as PrivacySuppressed. Note that for many elements, we have also taken additional steps to ensure data are stable from year to year and representative of a certain number of students. For many elements, data are pooled across two years of data to reduce year-over-year variability in figures (i.e. repayment rate, debt figures, earnings). Moreover, for elements that are highlighted on the consumer-facing College Scorecard, a separate version of the element is available that suppresses data for institutions with fewer than 30 students in the denominator to ensure data are as representative as possible."

Naming and Franchises

Some schools, especially for-profit organizations, have many branches spread out in different cities but they often only report one set of statistics for the entire institution. Over time, we plan to decompose these into their proper grouping. Examples include Strayer, University of Phoenix, Cortiva Institute, etc.

Just Remember..

"All models are wrong, but some are useful." -George Box