Data scientists

AI Overlap Index
57.9 / 100
Mostly Exposed

Most of the workflow is automatable. Human judgment remains for exceptions, clients, or ambiguity.

SOC 15-2051 · Math

Bureau of Labor Statistics
Median pay
$112,590/yr
Hourly
$54/hr
Jobs 2024
245,900
Projected 2034
328,300
10-yr outlook
+34% · Much faster than average
Employment change
82,500
Entry education
Bachelor's degree
SOC code
15-2051

Signal composition

how the 0-100 score is assembled

Task Automation Impact weight 60%
65.2
contribution to AOI: 39.1
Automation Potential weight 10%
90.0
contribution to AOI: 9.0
Market Pressure weight 15%
30.0
contribution to AOI: 4.5
Entry Barrier Erosion weight 15%
35.0
contribution to AOI: 5.2
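
A minimal sketch of how the four contributions above appear to combine, assuming the AOI is a plain weighted sum of the signal scores. The names `SIGNALS` and `compute_aoi` are illustrative and not part of the site's methodology; only the scores and weights are taken from this page.

```python
# (signal name, score on 0-100, weight) as listed on this page
SIGNALS = [
    ("Task Automation Impact", 65.2, 0.60),
    ("Automation Potential", 90.0, 0.10),
    ("Market Pressure", 30.0, 0.15),
    ("Entry Barrier Erosion", 35.0, 0.15),
]


def compute_aoi(signals):
    """Weighted sum of signal scores (assumed aggregation rule)."""
    total_weight = sum(weight for _, _, weight in signals)
    # The weights on the page add up to 100%; fail loudly if they do not.
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(score * weight for _, score, weight in signals)


# Per-signal contribution, i.e. score multiplied by weight
contributions = {name: score * weight for name, score, weight in SIGNALS}
aoi = compute_aoi(SIGNALS)

for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
print(f"AOI: {aoi:.2f}")
```

Under this assumption the sum comes to about 57.87, which agrees with the displayed 57.9 after rounding to one decimal. The Entry Barrier Erosion contribution works out to 5.25, so the 5.2 shown above looks like a display rounding of that value.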

By seniority

multiplicative adjustment from category curve

Entry
72.4
mult 1.25x
Mid
57.9
mult 1.00x
Senior
43.4
mult 0.75x

Entry-level roles bear the brunt because they concentrate the most automatable subset of tasks. Senior work is insulated by judgment, relationships, and accountability.
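
The seniority figures above can be reproduced with a short sketch, assuming each level is simply the mid-level AOI times the listed multiplier. The 100-point cap and the names `SENIORITY_MULTIPLIER` and `seniority_scores` are assumptions added for illustration; the page does not say how out-of-range values would be handled.

```python
BASE_AOI = 57.9  # mid-level score shown on this page

# Multipliers as listed under "By seniority"
SENIORITY_MULTIPLIER = {"Entry": 1.25, "Mid": 1.00, "Senior": 0.75}


def seniority_scores(base, multipliers, cap=100.0):
    """Scale the base score per level; the cap is an assumed safeguard."""
    return {level: min(base * mult, cap) for level, mult in multipliers.items()}


scores = seniority_scores(BASE_AOI, SENIORITY_MULTIPLIER)
for level, score in scores.items():
    print(f"{level}: {score:.1f}")
```

The results round to the 72.4, 57.9, and 43.4 shown above. The cap never triggers for this occupation, but it would matter for any occupation whose base score exceeds 80.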

Task-level analysis

scored 0-100 for current-generation AI feasibility, weighted by BLS-stated importance

9 tasks · model: claude-sonnet-4-5-20250929
Important t6

Clean and structure raw data to make them readable by software programs

Data cleaning and structuring is highly automatable: AI can handle missing values, standardize formats, detect outliers, and transform schemas. This is already a strength of modern data tools, and LLMs extend it by writing cleaning scripts from natural-language descriptions.

BLS evidence: Data scientists must 'clean' the raw data, a process by which they structure the data to make them readable by software programs.

80
automation
Core t3

Use data visualization software to present findings as charts, maps, and graphics

Code-generating LLMs can produce publication-ready visualizations from data by writing matplotlib or ggplot code or by driving tools such as Tableau. The main human role is selecting which insights to highlight and ensuring visual clarity for the audience.

BLS evidence: Data scientists 'Use data visualization software to present findings' and often present their findings as charts, maps, and other graphics.

75
automation
Core t1

Collect, categorize, and analyze data from various sources

AI systems can now autonomously collect data via APIs, categorize it using LLMs, and perform standard analyses. Human oversight is needed mainly for defining scope and validating unusual patterns; the bulk of routine data collection and analysis is automatable.

BLS evidence: Data scientists typically 'Collect, categorize, and analyze data' and may use various methods to obtain data including access to databases or web-scraping tools.

72
automation
Supporting t8

Develop or recommend systems and enhance web-browsing functions

AI can write code for system enhancements and web functions, suggest architectural improvements, and implement features. Human developers are still needed for integration decisions and edge cases, but AI substantially reduces the labor content.

BLS evidence: Data scientists with strong coding or engineering backgrounds may develop or recommend systems, build machine learning algorithms, and devise ways to enhance web-browsing functions.

70
automation
Core t2

Create, validate, test, and update algorithms and models for machine learning

Modern AI can generate, test, and iterate on ML models using AutoML frameworks and code generation. However, validating model appropriateness for business context and updating models based on shifting requirements still requires substantial human judgment.

BLS evidence: Data scientists 'Create, validate, test, and update algorithms and models' and develop algorithms to support programs for machine learning.

68
automation
Supporting t9

Conduct research for reports or academic journals

AI can draft literature reviews, synthesize findings, and structure research reports, which significantly accelerates the writing and analysis phases. Formulating novel research questions, designing rigorous methodologies, and ensuring academic integrity still require human oversight.

BLS evidence: Some data scientists conduct research for reports or academic journals.

62
automation
Important t5

Determine which data are available and useful for the project

AI can assess data availability, quality metrics, and relevance to objectives, but determining what's truly 'useful' requires understanding unstated business priorities and making judgment calls about ROI that benefit from human domain expertise.

BLS evidence: Data scientists typically 'Determine which data are available and useful for the project' and often begin by gathering or identifying relevant data sources.

58
automation
Core t4

Make business recommendations to stakeholders based on data analysis

AI can draft recommendations and identify patterns, but translating technical findings into actionable business strategy requires understanding organizational politics, risk tolerance, and implementation constraints that AI cannot reliably navigate without human guidance.

BLS evidence: Data scientists 'Make business recommendations to stakeholders based on data analysis' to help inform business decisions or process changes.

52
automation
Important t7

Communicate analyses to technical and nontechnical audiences

AI can generate clear written explanations and adapt technical depth, but live communication involves reading the room, handling unexpected questions, building trust with skeptical stakeholders, and adjusting messaging based on subtle social cues that AI cannot yet manage.

BLS evidence: Visualization techniques allow data scientists to clearly communicate their analyses to technical and nontechnical audiences, including colleagues, managers, and clients.

48
automation
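
The page says task scores are weighted by BLS-stated importance but does not publish the weights. The sketch below shows one way such a weighted average could be computed, using hypothetical tier weights of 3, 2, and 1 for Core, Important, and Supporting tasks. Those weights, and the names `IMPORTANCE_WEIGHT` and `weighted_task_score`, are assumptions for illustration only; the task IDs, tiers, and scores are copied from the list above.

```python
# Hypothetical tier weights; the page does not state the real ones.
IMPORTANCE_WEIGHT = {"Core": 3, "Important": 2, "Supporting": 1}

# (task id, importance tier, automation score) from the list above
TASKS = [
    ("t6", "Important", 80),
    ("t3", "Core", 75),
    ("t1", "Core", 72),
    ("t8", "Supporting", 70),
    ("t2", "Core", 68),
    ("t9", "Supporting", 62),
    ("t5", "Important", 58),
    ("t4", "Core", 52),
    ("t7", "Important", 48),
]


def weighted_task_score(tasks, weights):
    """Importance-weighted mean of the task automation scores."""
    total_weight = sum(weights[tier] for _, tier, _ in tasks)
    weighted_sum = sum(weights[tier] * score for _, tier, score in tasks)
    return weighted_sum / total_weight


task_impact = weighted_task_score(TASKS, IMPORTANCE_WEIGHT)
unweighted_mean = sum(score for _, _, score in TASKS) / len(TASKS)

print(f"weighted: {task_impact:.2f}, unweighted: {unweighted_mean:.2f}")
```

With these assumed weights the result is 65.25, close to the 65.2 shown for Task Automation Impact, while the unweighted mean of the same nine scores is 65.0. The agreement is suggestive but does not confirm that the site uses these weights.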

External signals and sources

category-level priors and BLS fields that feed the four non-task signals

Automation Potential
90
karpathy 9/10
  • Karpathy/BLS Digital AI Exposure (0-10 scale rescaled to 0-100)
Market Pressure
30
outlook: Much faster than average
  • BLS projected outlook: Much faster than average (34%)
  • Indeed demand signal (monthly refresh pending)
Entry Barrier Erosion
35
ed: Bachelor's degree
  • BLS typical entry-level education: Bachelor's degree
  • Credential trend signal (annual refresh)

Related in Math

closest AOI neighbors in the same category