It does not sound very exciting, but the past 25 years have seen an explosion of new data about jobs, wages and educational outcomes. As with any new technology, some places are using it very effectively, while others lag substantially. I’m particularly impressed with three new data sets, and the potential they offer for a much deeper understanding of jobs and schooling. The first is state-level linkage of schooling and employment data. The second is the web scraping of help wanted ads, and the third is the Quarterly Workforce Indicators.
The history of these data and what they can tell us is fascinating and a bit surprising. I’ve worked as an economist at three different state universities that collected and used these data, so I have a bit of insight into their use, lack of use and misuse.
In the 1990s, the Census Bureau, the Department of Labor and the Department of Education became convinced that better data about the educational and workplace experience of Americans might tell us something useful about the path from school to work. Each state, not the federal government, is responsible for education; therefore, the federal government sought to convince states to put data together in ways that would support serious study.
To accomplish this, states needed to connect individual school records to job records, or more precisely, to link records from elementary school through college or workforce programs to employment records. Lest anyone worry, one major challenge was ensuring that these data were completely anonymous. The goal of this work is to track experiences, not people, from school to job. For example, more than 20 years ago, I was contracted by a state commission on higher education to study the labor market outcomes of degree holders from different state universities. They wanted to know if there was a difference in earnings associated with different schools.
To do this study, I built a statistical model that compared several dozen degrees, such as nursing, finance, or MBA. I accounted for student demographics, where the job was located, and for the MBA analysis, the undergraduate major. I found no statistical difference between schools, which the commission found informative. This was an internal study, done confidentially for West Virginia.
Many states use these data today to better understand the effect of education on wages, to help identify underperforming schools or college degrees, or to better evaluate the effect of particular degrees on wages and employment. Sadly, some states do almost no serious analysis of education and labor market outcomes, despite spending tens of millions of dollars per year to collect these data.
It is hard to know what internal analysis is being performed by each state. These things are often done quietly, as a roadmap for policy. Still, it is useful to see what questions little old West Virginia was asking and answering two decades ago. My hunch is that a lot of states that think themselves sophisticated data users are a few years behind where West Virginia was in 2003.
The second big data innovation of the past few years is the collection of online help wanted ads. These data are collected by a process known as web scraping, in which software automatically gathers help wanted ads from a variety of sources. Federally, these data are reported along with other labor market data and used by forecasters and economists to better understand changes to employment.
There are also commercial sources of these data, which report weekly ads by occupation, wages, location, educational requirements and other characteristics. I find them very useful in teasing out questions about the composition of labor demand, and specific characteristics such as new job openings for remote work, or changing educational needs in some industries. But, some caution is in order. Data itself tells us nothing; it is the analysis that matters. I’ll use Indiana data as an example, but this mistake is commonplace.
I often hear from elected leaders that there are 200,000 unfilled jobs in Indiana, and that most of them are for people with high school diplomas or less education. That figure comes from help wanted ads, but many firms maintain help wanted ads constantly, particularly in high-turnover occupations such as retail, truck driving and nursing assistance.
In fact, there is no correlation between raw help wanted ads and labor demand. For example, in April 2020, the first full month of the pandemic, Indiana had 167,000 open help wanted ads. That month we actually lost 463,000 jobs. In the following month, as employment leaped back by 120,000 jobs, help wanted ads dropped to only 126,000. Turnover, not growing demand, drives almost all help wanted ads.
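The two-month comparison above can be laid out as a minimal sketch, using only the Indiana figures quoted in this column. Two observations cannot establish or rule out a correlation; the snippet simply shows that open ads and employment moved in opposite directions between April and May 2020. The data structure and variable names are illustrative assumptions, not part of any official data set.

```python
# Indiana figures quoted in the column for the first two pandemic months.
months = [
    {"month": "April 2020", "open_ads": 167_000, "job_change": -463_000},
    {"month": "May 2020", "open_ads": 126_000, "job_change": 120_000},
]

# Month-over-month change in open help wanted ads.
ads_change = months[1]["open_ads"] - months[0]["open_ads"]

# Swing in the monthly employment change (from a large loss to a gain).
jobs_swing = months[1]["job_change"] - months[0]["job_change"]

# True when one series fell while the other rose.
moved_opposite = (ads_change < 0) != (jobs_swing < 0)

print(f"Change in open ads: {ads_change:,}")
print(f"Swing in monthly job change: {jobs_swing:,}")
print(f"Ads and jobs moved in opposite directions: {moved_opposite}")
```

If ads tracked labor demand, the count of open ads would be expected to rise as hiring recovered; in these two months it fell instead.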
For example, from 1998 to 2022, job turnover among adults in Indiana suggested that firms should have advertised about 191,000 jobs each month. But, over that same period, the state only added 228,000 total jobs among adults aged 25 and older. At 191,000 ads a month for 25 years, that is roughly 57 million ads, or about 250 help wanted ads per net new job created over the past 25 years.
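The arithmetic behind that ratio can be checked with a short sketch. It assumes the 191,000 monthly figure holds in every month across 25 calendar years (1998 through 2022, counted inclusively); the function and variable names are illustrative and do not come from any data source.

```python
def ads_per_net_new_job(monthly_ads, years, net_new_jobs):
    """Return cumulative turnover-driven ads divided by net job growth."""
    total_ads = monthly_ads * 12 * years
    return total_ads / net_new_jobs


MONTHLY_ADS = 191_000   # turnover-implied ads per month, from the column
YEARS = 25              # 1998 through 2022, counted inclusively (assumption)
NET_NEW_JOBS = 228_000  # net jobs added among adults 25 and older

total_ads = MONTHLY_ADS * 12 * YEARS
ratio = ads_per_net_new_job(MONTHLY_ADS, YEARS, NET_NEW_JOBS)

print(f"Total ads implied: {total_ads:,}")
print(f"Ads per net new job: {ratio:.0f}")
```

Counting the period as 24 years instead of 25 would lower the ratio to about 241, so the "about 250" figure is not sensitive to how the endpoints are counted.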
The third type of data is the Census and BLS Quarterly Workforce Indicators. This data set collects dozens of different pieces of information, by industry, at the county or higher level. It allows us to examine job growth, earnings and turnover by gender, age group and education.
Piecing these data together offers the potential for deep insight into labor markets. For example, it takes two of these data sets to understand the link between job turnover and help wanted ads. And, it is this type of analysis that should help protect state workforce, economic development and education officials from costly policy mistakes.
For example, if you looked at help wanted ads, you’d think there was high demand for high school graduates in Indiana, and relatively low demand for college graduates. But from 1998 to 2022, total job growth for people who’d been to college numbered over 190,000, while for those with only a high school diploma, total job creation was -41,000. Over that time the supply of high school graduates has grown much faster than the supply of college grads.
There’s been more data created on labor and education in the past 25 years than in the preceding 25 centuries. It is accessible, rich and offers almost endless insights to folks doing deep, thoughtful study. I think states would dodge some of their most costly policy mistakes by more fully exploiting these data. That analysis would cost a tiny fraction of the public expense of collecting the data.
Michael Hicks is the George and Frances Ball Distinguished Professor of Economics and the director of the Center for Business and Economic Research at Ball State University.