Chapter 09: Essays #
Why There Will Be No Data Science Job Titles By 2029 #
originally published in Forbes, February 4, 2019
There will be no data science job listings in about ten years, and here is why. There are no MBA jobs in 2019, just like there are no computer science jobs. MBAs, computer science degrees, and data science degrees are degrees, not jobs. I believe companies are hiring people for data science job titles because they recognize emerging trends (cloud computing, big data, AI, machine learning), and they want to invest in them.
There is evidence to suggest this is a temporary phenomenon, though, which is a normal part of the technology hype cycle. We just passed the Peak of Inflated Expectations with data science, and we are about to enter the Trough of Disillusionment. From where I stand, the result will be that, yes, data science as a degree and a capability is here to stay, but the job title is not.
The coming Trough of Disillusionment with data science job titles will be driven by the following:
- Many data science teams have not delivered results that executives can measure in ROI.
- The excitement around AI and ML has temporarily led people to ignore the fundamental question: What does a data scientist do?
- Complex data engineering tasks require roughly five data engineers for each data scientist.
A recent example of a similar phenomenon is the systems administrator. This job title was one of the hottest in IT during the pre-cloud era. Looking at Google Trends from 2004 until now, you can see how Active Directory, a key skill for systems administrators, has swapped positions with AWS. A recent article by the job site Dice mentions several tech jobs in danger of becoming extinct. One of the most prominent is the Windows/Linux/Unix systems administrator. Those positions are being replaced by cloud services, DevOps tools, and DevOps engineers. I believe something similar will happen to data science job titles. The role of a data scientist will change into something else.
Does this mean data science is the wrong degree to get? I believe it will be a significant degree in the next ten years, but it will not be a job title. Instead, there will be an evolution. The takeaway for data scientists is to look toward improving their skills in things that are not automatable:
- Communication skills
- Applied domain expertise
- Creating revenue and business value
- Knowing how to build things
Some future job titles that may take a data scientist’s place include machine learning engineer, data engineer, AI wrangler, AI communicator, AI product manager, and AI architect. The only sure thing is change, and changes are coming to data science. One way to stay on top of this trend is to invest in data science, machine learning, and cloud computing skills while also embracing soft skills. Another way to think about this dilemma is to look at tasks that can be easily automated: feature engineering, exploratory data analysis, and trivial modeling. Instead, work on tasks that are harder to automate, like producing a machine learning system that increases critical business metrics and creates revenue.
Companies that want to be ahead of the curve can embrace the pragmatism and automation of machine learning. Becoming early adopters of cloud or third-party software solutions that automate machine learning tasks is a clear strategic advantage in 2019 and beyond.
Exploiting The Unbundling Of Education #
originally published in Forbes, December 26, 2019
Four out of 10 recent college grads in 2019 are in jobs that didn’t require their college degree. Student loan debt is at an all-time high, as it has climbed to more than $1.5 trillion this year. Meanwhile, the unemployment rate is hovering at 3.6%. What is happening? There is a jobs and education mismatch.
Satya Nadella, CEO at Microsoft, recently said that every company is now a software company. Where does software run? It runs on the cloud. So what does the cloud jobs market look like? It looks terrifying. A recent survey showed that 94% of organizations have trouble finding cloud talent. Jamie Dimon, chairman and CEO of Chase, says, “Major employers are investing in their workers and communities because they know it is the only way to be successful over the long term.”
Let’s break down the state of education and jobs for the majority of the market:
- Education is too expensive.
- Students are getting degrees in fields that are not employable.
- Even degrees in relevant fields (computer science, data science, etc.) are struggling to keep up with the rapidly changing job market’s current needs.
- Many professionals don’t have the time or money to attend traditional higher education.
- Alternative education formats are emerging (unbundling).
- Even traditional higher education programs are supplementing themselves with “go to market” resources straight from industry, like certifications.
What is unbundling? James Barksdale, the former CEO of Netscape, said that “there are only two ways I know of to make money: bundling and unbundling.” Education has been bundling for a long time, and it may be at a peak.
The cable television industry bundled for quite some time, adding more and more content and increasing the bill to the point that many consumers paid $100-$200 a month for TV. In many cases, consumers only wanted a specific show or network, like HBO, but it wasn’t an option. Fast forward to today, and we may be at a peak unbundling of streaming content with choices like Disney+, Netflix, Amazon, and more.
What are the similar offerings in education? Cloud computing providers create their own education channels. It is possible to study cloud computing in high school, get certified on a cloud platform, and jump right into a six-figure salary, at virtually zero cost. The education content and the cloud computing credits are free, maintained by the cloud vendors. This is a one-to-one match between education and jobs: free, but not provided by traditional higher education.
Likewise, many massive open online courses (MOOCs) provide smaller bundles of education, equivalent to, say, an HBO subscription for a premium slice of schooling. These formats offer both free and paid versions. The paid versions often include a narrow slice of the services a traditional college offers: career counseling, peer mentoring, and job placement.
At the elite level, the top 20% of universities, the current bundle makes a lot of sense. The economies of scale create compelling offerings. There may be dramatic changes underway for the bottom 80% of universities, similar to what has occurred with brick-and-mortar retail.
Let’s get to the exploit part now. There is a crisis in matching current job skills to qualified applicants. Whether in high school or with 20 years of experience, self-motivated learners can use free or low-cost learning solutions to upskill into the jobs of the future. The turnaround time is much quicker than a traditional two- or four-year degree. In a year of sustained effort, it’s possible to enter many new technology fields.
For students in graduate and undergraduate programs in hot fields like data science, machine learning, and computer science, you can create your own “bundle.” By mixing the benefits of your institution’s economies of scale and the unbundling of elite training, you can leapfrog the competition. Many graduate students I have taught machine learning to have thanked me for recommending they add an online bundle to their current degree. It made them uniquely relevant for a position.
Likewise, there is a surprising formula for undergraduate students in liberal arts worth exploiting. In talking with hiring managers about the missing skills in data scientists, they’ve told me that communication, writing, and teamwork are most desirable. English, communication, and art majors can and should grab a tech bundle and add it to their degree.
Here are a couple of ideas for what to look for in a tech bundle. Does it have an active or passive platform? An active platform allows you to consume the content (videos and books) and write code or solve problems against it. Is there a community associated with the product? Community-driven platforms allow you to connect with mentors and share your achievements on social media.
The days of getting a degree and expecting a job for life are over. Instead, the future is lifelong learning, using a variety of tools and services. The good news is that this is a lot of fun. Welcome to the brave new world of education unbundling.
How Vertically Integrated AI Stacks Will Affect IT Organizations #
originally published in Forbes, November 2, 2018
When was the last time the CPU clock speed in your laptop got faster? When was the last time it was cool to doubt the cloud? The answer is: around the time the vertically integrated AI stack started to get some serious traction. You might be asking yourself, “What is a vertically integrated AI stack?” The short answer is that there isn’t a perfect definition, but there are a few good starting points to add some clarity to the discussion.
Some of the popular raw ingredients to create a vertically integrated AI stack are data, hardware, machine learning framework, and the cloud platform.
According to University of California, Berkeley professor David Patterson, Moore’s Law is over. He states that “single processor performance per year only grew 3%.” At this rate, instead of processor performance doubling every 18 months, it will double roughly every 20 years. This roadblock in CPU performance has opened up other opportunities, namely in chips designed to run AI-specific workloads. Application-specific integrated circuits (ASICs) are microchips designed for particular tasks.
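As a quick sanity check on that arithmetic, a 3% annual improvement implies a doubling time of ln 2 / ln 1.03, which works out to roughly two decades:

```python
import math

# Years for performance to double when it grows 3% per year:
# solve 1.03 ** t == 2 for t.
doubling_years = math.log(2) / math.log(1.03)
print(round(doubling_years, 1))  # 23.4
```

So at single-digit annual gains, the 18-month doubling cadence of classic Moore’s Law stretches out to on the order of 20 years.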
The most commonly known ASIC is a graphics processing unit (GPU), which was historically used just for graphics. Today GPUs are the preferred hardware to do massively parallel computing like deep learning. Several converging factors have created a unique technology period: maturation of cloud computing technology, a bottleneck in CPU improvements, and current advances in artificial intelligence and machine learning. David Kirk, author of Programming Massively Parallel Processors: A Hands-on Approach, notes that “the CPU design optimizes for sequential code performance.” Yet, he goes on to state that “GPUs are parallel, throughput-oriented computing engines.”
How does all this relate to a vertically integrated AI stack? The Google Cloud is a great case study to clarify what vertically integrated AI means. The software often used on the Google Cloud is TensorFlow, which happens to be one of the most popular frameworks for deep learning. It can do training using CPUs or GPUs. It can also use TPUs (tensor processing units), which Google created specifically to compete with GPUs and run TensorFlow software more efficiently.
To perform deep learning, a machine learning technique that uses neural networks, there needs to be both a large amount of data and a large number of computing resources. The cloud then becomes the ideal environment to do deep learning. Training data lives on the cloud storage, and GPUs or TPUs are used to run deep learning software to train a model to perform a prediction (i.e., inference).
In addition to the storage, compute, and framework, Google also offers managed and automatic cloud platform software like AutoML, which allows a user to train deep learning models by uploading data and not performing any coding.
The final piece of vertical integration is edge-based devices like mobile phones and IoT devices. These devices can run a “light” version of the TPU: a previously trained machine learning model is distributed to the device, and the edge-based TPU performs inference (i.e., predictions). At this point, Google completely controls a vertical AI stack.
Google is not the only technology company thinking in terms of vertically integrated AI. Apple has a different flavor of this idea. Apple creates mobile and laptop hardware, and it has also made its own ASICs, the A11/A12/A13/A14 chips. These chips are explicitly designed to run deep learning algorithms on the edge, as in a self-driving car, a market Apple plans to enter. Additionally, the Xcode development environment allows for integrating Core ML code, where trained deep learning models can be converted to run on an iPhone and serve out predictions on the A12 chip. Because Apple also controls the App Store, it has complete vertical integration of AI software to the consumers who use its devices. Additionally, the AI code can run locally on the iOS device and serve out predictions without sending data back to the cloud.
These are just two examples of vertically integrated AI strategies, but many other companies are working toward this strategy, including Oracle, Amazon, and Microsoft. Gartner predicts that AI will be in every software application by 2020. For any IT organization, the question then becomes not if it will use AI, but when and how. A company should also consider how vertically integrated AI fits into its strategy, and which suppliers it will rely on to execute that strategy.
A practical method of implementing AI into any company would be to start with a “lean AI” approach. Identify use cases where AI could help solve a business problem. Customer feedback and image classification are two common examples. Next, identify an AI solutions vendor that provides off-the-shelf solutions for these problems via API and lower-level tools that allow for more customization. Finally, start with the AI API, and create a minimal solution that enhances the current product. Then push that out to production. Measure the impact, and if it is good enough, you can stop. If it needs to improve, you can move down the stack to a more customized result.
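The “start with the API, move down the stack later” step can be sketched in a few lines of Python. Here `classify_image` is a hypothetical stand-in for a vendor image-classification API call (the names and response shape are assumptions for illustration); the point is the thin wrapper that lets you later swap in a custom model without touching the rest of the product.

```python
def classify_image(image_bytes):
    """Hypothetical stand-in for an off-the-shelf vision API call.

    A real version would send image_bytes to a vendor endpoint and
    return its labels; this stub returns a canned response.
    """
    return [
        {"label": "cat", "confidence": 0.97},
        {"label": "animal", "confidence": 0.99},
        {"label": "keyboard", "confidence": 0.41},
    ]


def tag_photo(image_bytes, threshold=0.9):
    """Keep only high-confidence labels; callers depend on this
    function, not on the vendor behind it."""
    return [
        item["label"]
        for item in classify_image(image_bytes)
        if item["confidence"] >= threshold
    ]


print(tag_photo(b"fake-image-bytes"))  # ['cat', 'animal']
```

If the measured impact is good enough, stop; if not, only `classify_image` needs to be replaced with a more customized model, and the rest of the product stays untouched.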
Here Come The Notebooks #
originally published in Forbes, August 17, 2018
About five years ago, when I was in business school, almost every MBA course used Excel. Now, as someone who teaches at a business school, I’ve seen firsthand how virtually every class uses some type of notebook technology. Outside of the classroom, businesses are rapidly adopting notebook-based technologies that either replace or augment traditional spreadsheets. There are two primary flavors of notebook technology: Jupyter Notebooks and R Markdown.
Of the two notebook technologies, Jupyter has been on a tear as of late. Google offers versions of Jupyter via Colab notebooks and DataLab. Amazon Web Services (AWS) uses Jupyter technology in SageMaker, Kaggle uses Jupyter technology to host its data science competitions, and companies like Databricks, a managed Spark provider, also use Jupyter. In 2015, Project Jupyter received $6 million in funding via grants from the Gordon and Betty Moore Foundation as well as the Helmsley Charitable Trust. These funds funnel into upgrading a core technology used by all major cloud providers, including AWS, Google, and Azure. Each cloud provider adds its own custom twist (in the case of Google, it is GCP integration, and in the case of Azure, it is F# and R integration).
Why are all of these companies using Jupyter? The biggest reason is that it just works. Another reason is that Python has become the standard for doing data science and machine learning.
Google develops one of the newest and most compelling flavors of Jupyter notebook: Colab. It capitalizes on the existing Google Docs ecosystem by adding the ability to open, edit, share, and run Jupyter notebooks. If you want to see what this is like, I have several Colab notebooks that cover the basics of machine learning with Python that you can explore. If this is your first experience with either Jupyter or Colab notebooks, your jaw might drop when you see what they can do.
William Gibson had a great quote: “The future is already here – it’s just not very evenly distributed.” This statement is undoubtedly the case with notebook technology. It has already disrupted the data world, and it is probably already in your company, whether you use it or not. One specific way it has disrupted the data world is that it is the default platform for doing data science projects. Essentially, if you use a cloud and do data science, there is an excellent chance you will be using Jupyter technology. Second, some of the inherent limitations of traditional spreadsheets, like dealing with large datasets, disappear with the ability to write code that talks to cloud database technology. If a vendor provides a service in the data science ecosystem, it is likely to have both an example and a workflow in Jupyter.
One of the powerful side effects of this disruption is how it puts machine learning and data science directly into business professionals' hands. Many technology disruptions have an impact of displacing workers, but this disruption is very empowering. Within a few years (or less), it wouldn’t be a stretch to say that interactive data science notebooks will be as pervasive or more pervasive than traditional spreadsheets.
As I mentioned earlier, there is another flavor of notebook: R Markdown, a markdown-based notebook technology used heavily by the R community. R Markdown has many compelling features, including output to many formats, like Shiny interactive dashboards, PDF, and even Microsoft Word. Shiny is an interactive dashboard technology created using only the R language, and it is becoming a popular choice for interactive data visualization.
One of the best ways to understand notebooks is to use them. There are several ways to get started using notebooks in the workplace. Perhaps the easiest is to create a shared Google Colab notebook. Many business professionals are familiar with using a shared Google Document, and a similar style applies to a shared Colab notebook. First, open up a Colab notebook from the welcome page, change it, then share it with a co-worker or two.
In about 30 seconds, you can be using the same workflow as most data scientists around the world.
Cloud Native Machine Learning And AI #
originally published in Forbes, July 5, 2018
The cloud has been a disruptive force that has touched every industry in the last decade. Not all cloud technology is the same, though. There are cloud-native technologies like serverless and cloud-legacy technologies like relational databases, which were first proposed in 1970. One easy way to note this distinction is to think about the cloud as the new operating system. Before there was the cloud, you had to write your Windows, Mac, or some Unix/Linux flavor application. After the cloud, you could either port your legacy application and legacy technologies or design your application natively for the cloud.
An emerging cloud-native trend is the rise of serverless technologies. Recent examples include Amazon Web Services (AWS) Lambda, Google Cloud Functions, IBM Cloud Functions/OpenWhisk, and Microsoft Azure Functions. Coupled with the rise of serverless technologies is the emergence of managed machine learning platforms like AWS SageMaker. The typical SageMaker workflow is to provision a Jupyter notebook, then explore and visualize data sets hosted inside the AWS ecosystem. There is built-in support for single-click training on petabyte-scale data, and the model tunes automatically. Deployment of the model also works via a single click. Then you can use auto-scaling clusters with built-in support for A/B testing.
The next innovation cycle in machine learning is the emergence of higher-level technologies that exploit the native capabilities of cloud computing. In the past decade, it was still widespread to think about adding more storage to an application by driving to a data center and inserting physical disks into a rack of machines. Many companies are doing the equivalent of this when developing production machine learning systems. These legacy workflows will similarly become obsolete as higher levels of abstraction and automation become available. Taking advantage of cloud-native machine learning platforms could yield significant competitive advantages for organizations that pivot in this direction. Machine learning (ML) feedback loops that may have taken months now run in hours or minutes. For many companies, this will revolutionize how they use machine learning.
Fortunately, it is easy for companies to take advantage of cloud-native ML platforms. All major cloud platforms have a free-tier equivalent where any team member can sign up for an account on their own and create a pilot project. In AWS SageMaker, many example notebooks run as-is or with small modifications to use company data. It is often difficult to get initial buy-in in a large organization to use disruptive technology. One way around this is to solve it with your credit card. When you finish prototyping a solution, you can then present the results to the larger organization without getting stuck asking for permission.
Another similar trend is the use of off-the-shelf artificial intelligence (AI) APIs combined with serverless application frameworks. Complex problems are solved by leveraging cloud vendor APIs for natural-language processing, image classification, video analysis, and other cognitive services. Combining these technologies with serverless architectures allows small teams to build richer applications with faster development life cycles. A straightforward example is the following workflow.
A user uploads an image to cloud storage, where a cloud function is waiting for storage events. That cloud function then uses an image classification API to determine the contents of the image. Those labels (person, cat, car, etc.) are then stored in a serverless database and fed back to a machine learning system that updates an existing personalization API in real time. The model updates in real time because it uses next-generation machine learning technologies that support incrementally updated model training and deployment.
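The workflow above can be simulated end to end in plain Python. Everything here is a stand-in for illustration: the dictionary plays the serverless database, and the stubbed `classify` function replaces the real image-classification API; the event-driven shape of the flow is the point.

```python
label_store = {}  # stand-in for the serverless database


def classify(image_bytes):
    """Stub for a cloud image-classification API call."""
    return ["person", "car"]


def update_personalization(key, labels):
    """Stub for incrementally updating a personalization model."""
    pass


def on_storage_event(event):
    """Cloud-function handler: runs when an object lands in storage."""
    labels = classify(event["data"])               # call the vision API
    label_store[event["key"]] = labels             # persist the labels
    update_personalization(event["key"], labels)   # feed them downstream


# Simulate a user upload triggering the function.
on_storage_event({"key": "photos/001.jpg", "data": b"fake-bytes"})
print(label_store)  # {'photos/001.jpg': ['person', 'car']}
```

In a real deployment, each stub becomes a managed service call, but the handler stays this small.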
There are many other ways to combine off-the-shelf AI APIs with serverless application frameworks. All major cloud vendors are making significant progress in producing cognitive and AI APIs. These APIs can perform semantic analysis, recommendations, image and video processing and classification, and more. They are a natural fit for serverless architectures, and straightforward solutions can be coded up in hours by combining a few Python functions. Some of the business settings these solutions can apply to include quick-and-dirty prototypes by the internal data science team, external social media reputation monitoring, and internal hack-a-thons.
In phase one of the cloud era, many data center technologies were bolted onto the cloud. In phase two of the cloud era, cloud-native applications will unleash powerful new capabilities that smaller teams can deliver. In particular, there are exciting new opportunities for data science generalists who can harness these new machine learning and AI toolsets and ultimately control the stack as full-stack data scientists. They will perform the entire life cycle of data science, from problem definition to data collection, to model training, to production deployment.
One Million Trained by 2021 #
Disruption is always easy to spot in hindsight. How did paying one million dollars for a taxi medallion make sense as a mechanism to facilitate public taxi service?
(Figure: a San Francisco taxi medallion supplemental license plate, 1996.)

Ride-sharing services disrupted the taxi industry by offering:
- Lower price
- Push vs. Pull (Driver comes to you)
- Predictable service
- Habit building feedback loop
- Async by design
- Digital vs. analog
- Non-linear workflow
Current State of Higher Education That Will Be Disrupted #
A similar disruption is underway with education. Student debt is at an all-time high, with a linear growth rate since 2008, according to Experian.
Alongside that disturbing trend is an equally troubling statistic: four out of 10 college grads in 2019 were in jobs that didn’t require their degree.
This is not sustainable. Student debt cannot continue to grow every year while almost half of the outcomes fail to lead directly to a job. Why shouldn’t a student spend four years learning a hobby like personal fitness, sports, music, or culinary arts instead if the outcome is the same? At least in that situation, they would still be in debt, but they would have a fun hobby they could use for the rest of their lives.
In the book Zero to One, Peter Thiel mentions the 10X rule. He states that a company will need to be ten times better than its closest competitor to succeed. Could a product or service be 10X better than traditional education? Yes, it could.
Ten Times Better Education #
So what would a 10x education system look like then?
Built-in Apprenticeship #
If the focus of an educational program was jobs, why not train on the job while in school?
Focus on the customer #
Much of the current higher education system’s focus is on faculty and faculty research. Who is paying for this? The customer, the student, is paying for this. An essential criterion for educators is publishing content in prestigious journals, which has only an indirect link to the customer.
Meanwhile, in the open market, companies like Udacity and edX deliver goods directly to the customer. This training is job-specific and continuously updated at a pace much quicker than a traditional university can match.
The skills taught to a student can narrowly focus on getting a job outcome. The majority of students are focused on getting jobs. They are less focused on becoming a better human being. There are other outlets for this goal.
Lower time to completion #
Does a degree need to take four years to complete? It may take that long if much of the degree’s time is spent on non-essential tasks. Why couldn’t a degree take one or two years?
Lower cost #
According to USNews, the median annual tuition at a four-year school is $10,116 at a public, in-state university; $22,577 at a public, out-of-state university; and $36,801 at a private university. The total cost of getting a four-year degree (adjusted for inflation) has risen unbounded since 1985.
Could a competitor offer a product that is ten times cheaper? A starting point would be to undo what happened from 1985 to 2019. If the product hasn’t improved, but the cost has tripled, this is ripe for disruption.
Async and Remote First #
Many software engineering companies have decided to become “remote first”. In other cases, companies like Twitter are moving to a distributed workforce. In building software, the output is a digital product. If the work is digital, as instruction is, the environment can be made entirely async and remote. The advantage of an async, remote-first course is distribution at scale.
One of the advantages of a “remote first” environment is an organizational structure focused on outcomes more than location. There is tremendous disruption and waste in many software companies due to unnecessary meetings, noisy working environments, and long commutes. Many students will be heading into “remote first” work environments, and it could be a significant advantage for them to learn the skills to succeed in these environments.
Inclusion first vs Exclusion first #
Many universities publicly state how many students applied to their program and how few were accepted. This exclusion-first approach is designed to increase demand. If the asset sold is physical, like a Malibu beach house, then yes, the price will adjust higher based on the market. If the asset sold is digital and infinitely scalable, then exclusion doesn’t make sense.
There is no free lunch, though, and strictly boot camp style programs are not without issues. In particular, curriculum quality and teaching quality shouldn’t be an afterthought.
Non-linear vs serial #
Before digital technology, many tasks were serial operations. A good example is television editing technology. I worked as an editor for ABC Network in the 1990s, when you needed physical tapes to edit. Soon video became data on a hard drive, which opened up many new editing techniques.
Likewise, with education, there is no reason for an enforced schedule to learn something. Async opens up the possibility of many new ways to learn. A busy parent could study at night; existing employees could learn on the weekend or during their lunch breaks.
Life-long learning: Permanent access to content for alumni with continuous upskill path #
Educational institutions should consider going “remote first” because it would allow them to offer courses to alumni (for zero cost or a SaaS fee). SaaS could serve as a method of protection against the coming onslaught of competitors. Many industries require constant upskilling; the technology industry is a good example.
It would be safe to say that any technology worker needs to learn a new skill every six months. The current education product doesn’t account for this. Why wouldn’t alumni be given a chance to learn new material and gain certification on it? A stronger alumni network could lead to an even better brand.
Regional Job Market that Will Be Disrupted #
As a former Bay Area software engineer and homeowner, I don’t see any future advantage of living in the region at the current cost structure. The high cost of living in hypergrowth regions causes many cascading issues: homelessness, increased commute times, dramatically lowered quality of life, and more.
Where there is a crisis, there is an opportunity. Many companies realize that whatever benefit lies in an ultra-high-cost region, it isn’t worth it. Instead, regional centers with excellent infrastructure and a low cost of living have a massive opportunity. Some characteristics of regions that can act as job-growth hubs include access to universities, access to transportation, low housing costs, and government policies friendly to growth.
An excellent example of such a region is Tennessee. It has free associate degree programs as well as access to many top universities, a low cost of living, and proximity to top research institutes like [Oak Ridge National Lab](https://www.ornl.gov/). These regions can dramatically disrupt the status quo, especially if they embrace remote-first and async education and workforces.
Disruption of Hiring Process #
The hiring process in the United States is ready for disruption. It is easily disruptable because of its focus on direct, synchronous actions. The diagram below shows how it would be possible to disrupt hiring by eliminating all interviews and replacing them with automatic recruitment of individuals with suitable certifications. This is a classic distributed programming problem that fixes a bottleneck by moving tasks from a serial and “buggy” workflow to a fully distributed one. Companies should “pull” from a pool of certified workers rather than have workers continually pull on a locked-up resource in futile work.
Why “Learn To Cloud” Is Different Than “Learn To Code” #
One reason cloud computing is such an important skill is that it simplifies many aspects of software engineering. One of the issues it simplifies is building solutions. Many solutions often involve just a YAML-format file and a little bit of Python code.
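As an illustration of how small that footprint can be, here is a sketch in the style of an AWS SAM template that wires a Python handler to storage-upload events. The resource names and property values are assumptions for illustration, not a tested deployment:

```yaml
# template.yaml: sketch of a serverless function wired to storage events
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  LabelFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler        # a small Python function in app.py
      Runtime: python3.9
      MemorySize: 256
      Events:
        Upload:
          Type: S3
          Properties:
            Bucket: !Ref UploadBucket
            Events: s3:ObjectCreated:*
  UploadBucket:
    Type: AWS::S3::Bucket
```

The matching app.py is often just a handler function of a dozen lines: the YAML carries the infrastructure, and the Python carries the logic.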
This chapter ties together many ideas that have formed in my head, from teaching at top universities around the world to working at Bay Area startups. These essays were all written before Covid-19. Now, with Covid-19, many of the more fanciful ideas seem like strong trends.