Data scientists
Most of the workflow is automatable. Human judgment remains necessary for exceptions, client work, and ambiguous cases.
SOC 15-2051 · Math
Signal composition
how the 0-100 score is assembled
By seniority
multiplicative adjustment from category curve
Entry-level roles carry the brunt because they concentrate the most automatable subset of tasks. Senior work is insulated by judgment, relationships, and accountability.
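A minimal sketch of what a multiplicative seniority adjustment could look like. The page does not publish the category curve, so the multiplier values, the level names, and the clamping to 0-100 below are all assumptions made for illustration.

```python
# Hypothetical multipliers: the real category curve is not published here.
# Values above 1 raise exposure (entry-level); values below 1 lower it (senior).
SENIORITY_MULTIPLIER = {
    "entry": 1.15,
    "mid": 1.0,
    "senior": 0.8,
}


def adjust_for_seniority(base_score: float, level: str) -> float:
    """Scale a 0-100 base score by a per-level multiplier.

    The result is clamped so it stays on the 0-100 scale (an assumption).
    """
    adjusted = base_score * SENIORITY_MULTIPLIER[level]
    return max(0.0, min(100.0, adjusted))
```

With these illustrative numbers, the same base score comes out higher for an entry-level role than for a senior one, which matches the direction described above.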
Task-level analysis
scored 0-100 for current-generation AI feasibility, weighted by BLS-stated importance
Clean and structure raw data to make them readable by software programs
Data cleaning and structuring is highly automatable: AI can handle missing values, standardize formats, detect outliers, and transform schemas. This is already a strength of modern data tools, and LLMs extend it by writing cleaning scripts from natural-language descriptions.
BLS evidence: Data scientists must 'clean' the raw data, a process by which they structure the data to make them readable by software programs.
Use data visualization software to present findings as charts, maps, and graphics
AI can generate publication-ready visualizations from data, for example with code-generating LLMs that produce code for matplotlib, ggplot, or Tableau. The main human role is selecting which insights to highlight and ensuring visual clarity for the audience.
BLS evidence: Data scientists 'Use data visualization software to present findings' and often present their findings as charts, maps, and other graphics.
Collect, categorize, and analyze data from various sources
AI systems can now autonomously collect data via APIs, categorize it using LLMs, and perform standard analyses. Human oversight is needed mainly for defining scope and validating unusual patterns; the bulk of routine data collection and analysis is automatable.
BLS evidence: Data scientists typically 'Collect, categorize, and analyze data' and may use various methods to obtain data including access to databases or web-scraping tools.
Develop or recommend systems and enhance web-browsing functions
AI can write code for system enhancements and web functions, suggest architectural improvements, and implement features. Human developers are still needed for integration decisions and edge cases, but AI substantially reduces the labor content.
BLS evidence: Data scientists with strong coding or engineering backgrounds may develop or recommend systems, build machine learning algorithms, and devise ways to enhance web-browsing functions.
Create, validate, test, and update algorithms and models for machine learning
Modern AI can generate, test, and iterate on ML models using AutoML frameworks and code generation. However, validating that a model is appropriate for its business context and updating models as requirements shift still require substantial human judgment.
BLS evidence: Data scientists 'Create, validate, test, and update algorithms and models' and develop algorithms to support programs for machine learning.
Conduct research for reports or academic journals
AI can draft literature reviews, synthesize findings, and structure research reports, and it significantly accelerates the writing and analysis phases. Formulating novel research questions, designing rigorous methodologies, and ensuring academic integrity still require human oversight.
BLS evidence: Some data scientists conduct research for reports or academic journals.
Determine which data are available and useful for the project
AI can assess data availability, quality metrics, and relevance to objectives, but determining what's truly 'useful' requires understanding unstated business priorities and making judgment calls about ROI that benefit from human domain expertise.
BLS evidence: Data scientists typically 'Determine which data are available and useful for the project' and often begin by gathering or identifying relevant data sources.
Make business recommendations to stakeholders based on data analysis
AI can draft recommendations and identify patterns, but translating technical findings into actionable business strategy requires understanding organizational politics, risk tolerance, and implementation constraints that AI cannot reliably navigate without human guidance.
BLS evidence: Data scientists 'Make business recommendations to stakeholders based on data analysis' to help inform business decisions or process changes.
Communicate analyses to technical and nontechnical audiences
AI can generate clear written explanations and adapt technical depth, but live communication involves reading the room, handling unexpected questions, building trust with skeptical stakeholders, and adjusting messaging based on subtle social cues that AI cannot yet manage.
BLS evidence: Visualization techniques allow data scientists to clearly communicate their analyses to technical and nontechnical audiences, including colleagues, managers, and clients.
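The task-level roll-up described in this section (each task scored 0-100, weighted by importance) can be sketched as an importance-weighted average. This is an illustration only: the `Task` class, the function names, and the feasibility and importance numbers are hypothetical, and the page does not state that its aggregation is exactly a weighted mean.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    feasibility: float  # 0-100 AI feasibility score for the task
    importance: float   # importance weight (hypothetical scale)


def weighted_task_score(tasks: list[Task]) -> float:
    """Importance-weighted mean of the per-task feasibility scores."""
    total_weight = sum(t.importance for t in tasks)
    if total_weight == 0:
        raise ValueError("at least one task needs a non-zero importance")
    return sum(t.feasibility * t.importance for t in tasks) / total_weight


def weighted_contributions(tasks: list[Task]) -> list[tuple[str, float]]:
    """Each task's share of the overall score, largest first."""
    total_weight = sum(t.importance for t in tasks)
    shares = [(t.name, t.feasibility * t.importance / total_weight) for t in tasks]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores and weights, not taken from the page.
example_tasks = [
    Task("Communicate analyses", feasibility=30.0, importance=1.0),
    Task("Clean and structure raw data", feasibility=90.0, importance=3.0),
]
print(weighted_task_score(example_tasks))     # 75.0
print(weighted_contributions(example_tasks))
```

Sorting by contribution rather than by raw feasibility means a highly automatable but low-importance task can rank below a moderately automatable task that dominates the role.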
Task heatmap
automation score by task, sorted by weighted contribution
External signals and sources
category-level priors and BLS fields that feed the four non-task signals
- Karpathy/BLS Digital AI Exposure (0-10 scale rescaled to 0-100)
- BLS projected outlook: Much faster than average (34%)
- Indeed demand signal (monthly refresh pending)
- BLS typical entry-level education: Bachelor's degree
- Credential trend signal (annual refresh)
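The first signal in the list above is described as a 0-10 scale rescaled to 0-100. A minimal sketch of that step follows, assuming the rescaling is a plain linear multiplication by 10; the page does not say which mapping is actually used, and the function name is hypothetical.

```python
def rescale_exposure(score_0_to_10: float) -> float:
    """Map a 0-10 exposure score onto 0-100, assuming a linear rescale."""
    if not 0 <= score_0_to_10 <= 10:
        raise ValueError("exposure score must be between 0 and 10")
    return score_0_to_10 * 10.0
```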
Related in Math
closest AOI neighbors in the same category