How LinkedIn is moving towards a skills-based economy with the Skills Graph
What is a skills-based economy and how is LinkedIn moving from vision to implementation? As LinkedIn Director of Engineering Sofus Macskássy shares, there’s AI, taxonomy, and ontology involved in building the Skills Graph that powers it.
Skills are the new currency. That’s a bold statement, coming from LinkedIn CEO Ryan Roslansky. Roslansky makes the case for the so-called skills-first economy based on both anecdotal evidence and data. Skills-first hiring was mentioned in the 2022 State of the Union address, and a growing number of CEOs are calling on companies to shift how they hire.
In addition, recent LinkedIn data shows that the skill sets for jobs have changed by around 25% since 2015. By 2027, this number is expected to double. That means jobs are changing on you even if you aren’t changing jobs, just as business demands are changing on you even if you’re not changing your business.
That was not the first or the last time Roslansky made that point. In 2021, LinkedIn’s CEO outlined a vision to help transition the hiring market from focusing solely on titles and companies, degrees and schools to also focusing on skills and abilities. In his 2021 post, Roslansky announced new LinkedIn features and services. He also referred to AI-driven changes, a prescient point brought forward again in 2023.
Roslansky is not alone in identifying this shift. In 2018, we posited that in a rapidly shifting job market, being able to formalize skills is a requirement for job seekers and employers going forward. In 2019, we followed that up with more analysis on the relationship between the future of work, skills, and knowledge graphs.
In 2021, Roslansky first referred to the LinkedIn Skills Graph. The Skills Graph was introduced to help create a common skills language, and it’s powering a multitude of LinkedIn features and services as well as Microsoft Viva. In 2022, LinkedIn Director of Engineering Sofus Macskássy elaborated on how LinkedIn’s Skills Graph is being built to power a skills-first world.
Further details on building and maintaining the skills taxonomy that powers LinkedIn’s Skills Graph were shared earlier in 2023. Today, Macskássy and his team are sharing more details on how they are extracting skills from content to fuel the LinkedIn Skills Graph. We caught up with Macskássy to discuss the journey from vision to implementation.
LinkedIn skills taxonomy: Helping everyone speak the same “skills language”
As Macskássy pointed out, the story of skills on LinkedIn goes way back. LinkedIn users have long been able to add skills to their profiles and endorse each other. LinkedIn understood that skills represent an important vocabulary by which people communicate and understand each other. But at some point the realization settled in that this goes beyond just finding a job or finding talent.
When the CEO highlighted LinkedIn should move into a skills-first world, there was a question – what does that even mean? From a technical perspective, the team needed to understand what the vision was.
As the team started digging into the various product lines as well as the news feed, advertising, recommendation and search, they realized that skills could be leveraged as signals to aid ranking and recommendation. But going from vision to implementation has not been without obstacles.
Many people didn’t have skills listed on their profiles. Even when they did, people did not necessarily use the same vocabulary. It was not easy to figure out whether two skills were related to each other or even synonyms of each other. This is what Roslansky’s mention of helping everyone speak the same “skills language” alluded to.
To resolve ambiguity, the first approach was to build out the LinkedIn skills taxonomy. The skills taxonomy is where LinkedIn organizes and categorizes skills based on their hierarchical relationships to each other.
Each skill is represented by a “node” in the skills taxonomy, and nodes are linked together to form a hierarchical skill network through “edges” called knowledge lineages. Knowledge lineages reflect how two skills relate to each other. Skills may relate for various reasons: for example, both skills may be part of the same career specialization, or one skill may be a tool that is used to apply another skill.
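To make the node-and-edge picture concrete, here is a minimal sketch of a hierarchical skill network. It is a toy illustration and not LinkedIn's implementation: the class name SkillTaxonomy, its methods, and the example skills are all assumptions made for the sake of the example.

```python
from collections import defaultdict


class SkillTaxonomy:
    """Toy taxonomy: skills are nodes, parent -> child links are edges."""

    def __init__(self):
        # Hypothetical storage: adjacency sets in both directions.
        self.children = defaultdict(set)
        self.parents = defaultdict(set)

    def add_lineage(self, parent, child):
        """Record one hierarchical edge between two skills."""
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def ancestors(self, skill):
        """Return every skill reachable by walking parent edges upward."""
        seen = set()
        stack = [skill]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def related(self, a, b):
        """Treat two skills as related if their lineages share any node."""
        lineage_a = self.ancestors(a) | {a}
        lineage_b = self.ancestors(b) | {b}
        return bool(lineage_a & lineage_b)


# Illustrative skills only; these are not taken from LinkedIn's taxonomy.
taxonomy = SkillTaxonomy()
taxonomy.add_lineage("Software Development", "Machine Learning")
taxonomy.add_lineage("Machine Learning", "Deep Learning")
taxonomy.add_lineage("Deep Learning", "TensorFlow")  # a tool used to apply a skill
taxonomy.add_lineage("Software Development", "Web Development")

print(sorted(taxonomy.ancestors("TensorFlow")))
print(taxonomy.related("TensorFlow", "Web Development"))
```

In this sketch, two skills count as related when their lineages meet at a common node, which is one simple way to answer the "are these two skills related?" question that a flat list of skills cannot.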
To create a stronger network of connected skills, a framework called “Structured Skills” is utilized. This framework increases understanding of every skill by mapping the relationships it has to other skills around it.
Building the Skills Graph with AI and humans in the loop
The connected skills taxonomy is curated by a combination of human taxonomists and machine learning. This Human-In-The-Loop approach to building the Skills Graph helps grow the taxonomy at scale while ensuring the skills data meets required quality and standards.
“It is a feedback loop where we see new skills and new ways of mentioning skills across the board. We use this to expand the Skills Graph dynamically as we see new ways of phrasing a skill or even a new skill we had never seen before.
For example, prompt engineering is now a new skill that really popped up over the last couple of quarters. So we want to dynamically add it to the Skills Graph. We then use these new skills that can subsequently be tagged in the content”, Macskássy said.
That is a continuous process and the cadence by which it’s done depends on the type of content. The skill harvesting process happens on demand as content is updated. Potential additions or improvements to the Skills Graph are batched, so reviews and updates don’t have to be executed every time any single update happens.
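The batching idea can be sketched in a few lines. This is an assumption-laden illustration: the ReviewQueue class, the batch size, and the example phrases other than "prompt engineering" are hypothetical, and the actual review workflow is not described in detail.

```python
class ReviewQueue:
    """Collect candidate skills as content changes; release them in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size  # hypothetical threshold for a review round
        self.pending = []             # candidates waiting for review
        self.batches = []             # batches handed over to reviewers

    def propose(self, candidate):
        """Called on demand whenever harvesting finds a possible new skill."""
        if candidate not in self.pending:
            self.pending.append(candidate)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Hand all pending candidates to review as a single batch."""
        if self.pending:
            self.batches.append(list(self.pending))
            self.pending.clear()


queue = ReviewQueue(batch_size=3)
harvested = ["prompt engineering", "prompt design",
             "prompt engineering", "LLM fine-tuning"]
for phrase in harvested:
    queue.propose(phrase)

print(queue.batches)  # one batch of three distinct candidates
print(queue.pending)  # nothing left waiting
```

Harvesting stays cheap and immediate, while the more expensive review step runs once per batch rather than once per content update.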
In 2022, LinkedIn’s taxonomy consisted of over 39,000 skills spanning 26 languages, over 374,000 aliases (different ways to refer to the same skill – e.g., “data analysis” and “data analytics”), and more than 200,000 links between skills. In 2023, there are more than 41,000 skills.
Instead of relying only on taxonomists to manually curate over 41,000 skills, LinkedIn applies machine learning techniques to help scale the taxonomy construction. This includes a tool LinkedIn developed, KGBert, which was inspired by KG-Bert, a supervised model that applies deep semantic understanding of skills to predict relationship lineages.
From taxonomy to ontology
By utilizing machine learning models, LinkedIn can extract and map skills from diverse content sources and collect feedback for continuous model improvement and member value. To do this, large pieces of text (such as job descriptions and resumes) first need to be segmented out into meaningful parts.
Mentions of skills can then be extracted from each piece of the text. Once extracted, they are normalized into single, canonical representations in the skills taxonomy (e.g., “data analytics” and “data analysis” map to the same skill).
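The segment, extract, and normalize steps can be illustrated with a deliberately simple sketch. LinkedIn uses machine learning models for this; the version here swaps in a plain dictionary lookup so the flow is easy to follow. The ALIASES table, function names, and sample job description are assumptions made for illustration.

```python
import re

# Hypothetical alias table: surface form -> canonical skill name.
ALIASES = {
    "data analysis": "Data Analysis",
    "data analytics": "Data Analysis",
    "prompt engineering": "Prompt Engineering",
}


def segment(text):
    """Split text into rough sentence-like segments."""
    return [part.strip() for part in re.split(r"[.\n;]+", text) if part.strip()]


def extract_skills(text):
    """Return canonical skills mentioned in the text, in order of first appearance."""
    found = []
    for seg in segment(text):
        lowered = seg.lower()
        hits = []
        for alias, canonical in ALIASES.items():
            match = re.search(r"\b" + re.escape(alias) + r"\b", lowered)
            if match:
                hits.append((match.start(), canonical))
        # Keep mentions in reading order and drop duplicates across segments.
        for _, canonical in sorted(hits):
            if canonical not in found:
                found.append(canonical)
    return found


job_description = (
    "We need strong data analytics skills. "
    "Experience with prompt engineering is a plus; data analysis tooling preferred."
)
print(extract_skills(job_description))
# ['Data Analysis', 'Prompt Engineering']
```

Both "data analytics" and "data analysis" collapse into one canonical entry, which is the point of normalization: downstream matching sees a single skill regardless of how it was phrased.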
Attention also has to be paid to where a skill sits in a piece of content and what type of content it is. Skills are often represented differently in resumes, member profiles, or job descriptions, so models are fine-tuned to learn the specifics of those types of content.
As former AI Division Technical Lead for Taxonomies and Ontologies at LinkedIn Mike Dillinger points out, however, taxonomies are the duct tape of connected data. They seem simple, flexible, and familiar. They are widely used. And they seem to work across many use cases and many domains.
But when looked at in more detail, taxonomies turn out to be crude tools for knowledge organization that are difficult to create, to scale, to adapt, to align, and to build on. A key reason is the fact that they are limited to modeling hierarchical relationships, which are only a small part of the rich relations connecting entities in the real world.
Some of the technical details shared on the Skills Graph allude to harvesting and exploiting more than hierarchical relationships. Macskássy verified that LinkedIn utilizes an ontology that is used not only for mapping skills between branches, but also for mapping skills to other concepts. That has provided great improvements when it comes to ranking and recommendation, with more details to be unveiled in due course.
Explicit and implicit skills
Applications across LinkedIn include career relevant skills and job important skills. The approach has resulted in performance improvements in Job Recommendation, Job Search and Job Member Skills matching.
As part of the process, recruiter skill feedback and seeker skill feedback are collected. When a recruiter manually posts a job on LinkedIn, a list of skills, pulled by LinkedIn’s AI model, is suggested after they fill in the posting content. A recruiter can edit this list depending on whether they believe a skill is important.
Similarly, when a job seeker opens a job posting on LinkedIn, a feature will show how many skills overlap between their profile and the job. Seekers can review the top 10 skills used for skill matching calculation and if a certain skill is irrelevant to the job, they can provide feedback.
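A skill-overlap calculation of this kind could look roughly like the following. This is a sketch under stated assumptions: the function name skill_match, the case-insensitive comparison, the flagged_irrelevant feedback parameter, and the sample skills are all hypothetical, and the real matching logic is not public.

```python
def skill_match(profile_skills, job_skills, top_n=10, flagged_irrelevant=()):
    """Count how many of a job's top skills also appear on a member profile.

    job_skills is assumed to be ordered by importance. Skills the seeker has
    flagged as irrelevant (hypothetical feedback) are left out of the matching.
    """
    flagged = {s.casefold() for s in flagged_irrelevant}
    considered = [s for s in job_skills[:top_n] if s.casefold() not in flagged]
    profile = {s.casefold() for s in profile_skills}
    matched = [s for s in considered if s.casefold() in profile]
    return matched, len(considered)


profile = ["Python", "Data Analysis", "SQL"]
job = ["python", "SQL", "Machine Learning", "Public Speaking"]

matched, total = skill_match(profile, job)
print(f"{len(matched)} of {total} skills match: {matched}")

# The seeker reports that one listed skill is irrelevant to this job.
matched, total = skill_match(profile, job, flagged_irrelevant=["Public Speaking"])
print(f"{len(matched)} of {total} skills match: {matched}")
```

Feeding the seeker's feedback back into the calculation changes the denominator, which is one simple way such feedback could influence what is shown.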
Obviously, harvesting and utilizing those “implicit” skills, in addition to skills explicitly listed by members, is something LinkedIn has put a lot of effort into. And there are anchors to surface those implicit skills and to collect feedback from job seekers and recruiters on both ends of the recruiting process.
Macskássy noted that there are surfaces and flows by which members can add skills. LinkedIn continues to improve those flows where suggestions are made based on member resumes and profiles. Members are asked whether they might want to add certain skills and even associate certain skills with certain of their jobs as well. However, prompting is not done in an overly aggressive way.
As for implicit skills, Macskássy said those are not necessarily surfaced because they are more fluid in nature. It makes less sense to surface all of these, particularly as LinkedIn is moving forward into how to think about skills more dynamically and more in-depth, he added. So those skills are not considered as part of member profiles and are not exported either.
Skill provenance, credibility, and depth
There are more dimensions of implicit skill harvesting that are worth highlighting: provenance, credibility, depth and interoperability. Since implicit skills are extracted from content that members themselves provide, is there a way to evaluate the weight this content carries or does it have to be taken at face value?
The content that members provide is how they want to represent themselves, Macskássy said. So that has to be taken somewhat at face value, as it all comes down to trust. LinkedIn has not observed that members are stretching the truth, because there are a lot of people looking at their professional profiles and they would be called out, Macskássy added.
But there is a mechanism through which the depth of member expertise can be assessed. LinkedIn Skill Assessments (SAs) are adaptive assessments designed by LinkedIn Learning experts to evaluate and validate skills across a range of domains.
These short-form assessments are accessible through the profile skills section, where members can click on the “Take skill quiz” button to access a list of SA recommendations. Upon passing an SA with a score in the 70th percentile or higher, members are awarded a “verified skill” badge that they can display on their profile page and that is visible to recruiters.
Skill assessments, as well as other learning material, align with skills in the skills taxonomy. This information is represented internally in member data and can be part of ranking. The number of SAs and certificates is still very small compared to other content. LinkedIn is considering how to best elicit and leverage it, for example by evaluating member interactions.
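For readers unfamiliar with percentile thresholds, here is a small worked example. How LinkedIn actually computes the percentile is not stated; this sketch assumes a percentile rank defined as the share of other scores strictly below the member's score, and the function names and sample scores are hypothetical.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in all_scores that are strictly below score."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def earns_badge(score, all_scores, threshold=70.0):
    """True when the score sits at or above the threshold percentile."""
    return percentile_rank(score, all_scores) >= threshold


# Made-up scores from ten test takers, for illustration only.
scores = [40, 55, 60, 62, 70, 75, 80, 85, 90, 95]

print(percentile_rank(85, scores), earns_badge(85, scores))  # 70.0 True
print(percentile_rank(80, scores), earns_badge(80, scores))  # 60.0 False
```

The key point is that the bar is relative: passing depends on how a member scores compared to other test takers, not on a fixed number of correct answers.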
Interoperability, Open Badges, and future work
Something that could potentially help both in terms of interoperability as well as content enrichment is Open Badges. Open Badges is a data specification initiated by Mozilla and the MacArthur Foundation. The goal is to provide a simple but flexible format for documenting and showcasing skills. A badge corresponds to a skill a person has, as recognized by a third party.
That means that by using Open Badges, LinkedIn members would be able to both import skills documented by third parties and export their skills in a standardized format. Macskássy said that the team is aware of Open Badges; however, the focus is primarily on what can help with the member experience on the platform. Open Badges is on the radar, but has a low priority at this point.
LinkedIn is investing in continuously improving its skill understanding capabilities through approaches such as leveraging Large Language Models. Examples include using LLMs to provide rich descriptions of skills, fine-tuning LLMs to improve skill extraction, and leveraging embeddings for skill representation.
As the team notes, the LinkedIn Skills Graph is at the center of powering the skills-first transformation. The tech stack for mapping content to the Skills Graph enables constant update and evolution of the Skills Graph to stay up to date on the always-changing skills landscape.