How to Capture Keypoints With Ease: A Beginner's Guide for Data Annotation

Caroline Lasorsa

Product Marketing · 2022/5/11 · 7 min read

In machine learning, your labeled data is the foundation of your model, so it’s incredibly important to choose the right technique, establish a QA process, and follow through with each testing phase. As technology advances faster than in previous decades, AI is becoming commonplace in areas such as security and facial recognition. Take unlocking a phone with your face, for example: Apple went through many rounds of testing to get it right, and they used detailed keypoints to get there.

Keypoints: What are they?

In data labeling, the strategies machine learning engineers use to build out their models depend entirely on the use case. Much of what we do in artificial intelligence sits at the intersection of humans and machines, especially in cases that detect human movement or emotions. In these instances, we’re likely to annotate using keypoints: dot-like markings that reference the important parts of the object being labeled. Think of a human face. A machine needs to know where the eyes, nose, mouth, and other features are in order to recognize the face as human; this is where you would place your keypoints.

The purpose of using keypoint annotation as opposed to other labeling types is to make a general, skeletal outline of your objects, which is why humans and animals are commonly labeled in this way. By outlining the major parts of an image, your model learns to detect outlines and places of movement. For example, when outlining the shape of a human, many of the keypoints will sit at major joints, such as the knuckles of a hand.
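To make the idea concrete, here is a minimal sketch of how a keypoint label and its skeletal outline might be stored. The class names, the visibility flag, and the finger example are illustrative assumptions, not the format of any particular labeling tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Keypoint:
    name: str
    x: float  # pixel column
    y: float  # pixel row
    visible: bool = True  # assumption: annotators flag occluded points


@dataclass
class KeypointAnnotation:
    label: str
    keypoints: List[Keypoint]
    # Pairs of keypoint names that are joined to form the skeletal outline
    skeleton: List[Tuple[str, str]] = field(default_factory=list)

    def edges(self):
        """Return the skeleton's line segments, skipping any hidden point."""
        by_name = {kp.name: kp for kp in self.keypoints}
        segments = []
        for start, end in self.skeleton:
            a, b = by_name[start], by_name[end]
            if a.visible and b.visible:
                segments.append(((a.x, a.y), (b.x, b.y)))
        return segments


# Hypothetical label for one finger; the fingertip is marked as occluded
finger = KeypointAnnotation(
    label="index_finger",
    keypoints=[
        Keypoint("knuckle", 120.0, 200.0),
        Keypoint("middle_joint", 125.0, 160.0),
        Keypoint("fingertip", 128.0, 120.0, visible=False),
    ],
    skeleton=[("knuckle", "middle_joint"), ("middle_joint", "fingertip")],
)

print(finger.edges())  # [((120.0, 200.0), (125.0, 160.0))]
```

Storing the skeleton as pairs of names rather than drawing order keeps the outline intact when a point is hidden or added later.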

Why Keypoints?

Keypoint annotation is best used in instances where your model is tracking movements, meaning that it’s commonly used in video annotation models. Using keypoints to analyze movements helps your model detect certain injuries, analyze weaknesses, and track facial expressions.

Facial Recognition

Security breaches are at an all-time high, meaning that the prevention measures companies take must be highly developed and secure. In the everyday use of our smartphones, for example, facial recognition and detection add an extra layer of protection against hackers and other security threats. Underneath the seemingly simple act of looking at a phone screen to unlock it is a complex and robust model that has to be trained on a large dataset. And to make it happen, engineers at Apple likely made use of thousands of keypoints before putting this feature into production.

When building a facial recognition model, engineers look at keypoint placements to measure important distances, such as the space between your eyes and the placement of your nose or forehead. Analyzing those keypoints, the model can then learn the details of a person’s face. After looking at hundreds of thousands of faces, your model will start to detect a pattern.
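As a rough illustration of measuring distances between keypoints, the sketch below computes the eye spacing and the nose position relative to it. The coordinates and feature names are hypothetical, and production face recognition systems learn far richer features than these two numbers; this only shows the distance idea.

```python
import math

# Hypothetical pixel coordinates for three facial keypoints
face = {
    "left_eye": (100.0, 120.0),
    "right_eye": (160.0, 120.0),
    "nose_tip": (130.0, 160.0),
}


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def face_features(points):
    """Return the eye spacing and the nose drop expressed relative to it."""
    left, right = points["left_eye"], points["right_eye"]
    eye_span = distance(left, right)
    eye_mid = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    nose_drop = distance(eye_mid, points["nose_tip"])
    return {
        "eye_span": eye_span,
        # Dividing by the eye span makes the value independent of image scale
        "nose_to_eye_ratio": nose_drop / eye_span,
    }


features = face_features(face)
print(features)  # eye_span is 60.0; the ratio is roughly 0.67
```

Ratios are used instead of raw pixel distances so that the same face photographed closer or farther away produces comparable values.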

Beyond our Smartphones

Outside of using our faces to prevent nosy people from reading our text messages, facial recognition has other practical use cases. In U.S. airports, for example, the Department of Homeland Security uses facial recognition software to detect those who have outstayed their visas or are in the country illegally. One notable company developing facial recognition solutions on both the federal and local levels is Clearview AI. Using open-source data from across the web, they’ve surpassed the abilities of traditional criminal databases and have stopped major operations such as human trafficking rings in Las Vegas and drug smuggling attempts on the US/Mexico border. Because mugshots can be compiled and compared with surveillance footage of various infractions, social media posts, or the faces in a database, it’s becoming much harder to get away with criminal activity. These practices are also used at large events and gatherings to identify people associated with past crimes. With technology on our side, major threats are often prevented.

Outside of law enforcement, facial recognition is used for tasks such as opening online banking apps and approving transactions. Some car companies, such as Ford, are testing the waters by developing facial recognition software as an added layer of security and theft prevention. Step into a car that isn’t yours? It won’t recognize the person behind the wheel and therefore won’t start.


Facial detection and recognition software is complex and requires tons of data to build and ensure accuracy. A single mistake in the model might cause it to miss a person of interest, or it could lead to a wrongful arrest. Accurate models must consider edge cases. Distant photos of a person’s face, or ones that show only a profile view, are prone to error. In addition, many people worry about the security of having their faces stored in a database. What if those photos, that data, and those keypoints fall into the wrong hands? All of these factors must be taken into consideration by machine learning professionals.

Movement Detection and Sports

It seems unlikely that AI has any place in sports, but professional athletics is finding a need for keypoint annotation. Using AI technology, some organizations are analyzing player movement, taking note of performance improvements that are otherwise not noticeable to the naked eye. In addition, a slight change in muscle movement may indicate a looming injury. Detecting it before it happens aids in prevention and has the potential to lengthen a player’s career.

For coaches, using AI models to analyze a player’s strengths and weaknesses also helps with recruiting and assessment. A model trained with keypoint annotation can predict a player’s movements and gauge their level of skill. This data is then stored and evaluated against other players to paint a fair picture of skill level across candidates. Coaches can assess a player’s strongest attributes and compare them with those of the rest of the team or of other potential players, getting an overall summary of what a team may look like. And it helps find players who have yet to prove themselves in the big leagues.

Second Spectrum has done exactly this. As the leading advanced player tracking system in both the NBA and Premier League, they’ve developed a way to analyze player performance, better predict game outcomes, and search for any play within seconds. Second Spectrum has used AI technology to apply math to sports, compiling statistical information to develop reports and help analyze the game. It’s not just about jump shots anymore; we can now see in a numerical sense how these plays fit into the overall game and fine-tune a team’s strategy to help make better decisions and make the playoffs.

Everyday Exercise

Beyond professional sports, keypoint annotation and analysis have played a big role in virtual exercise apps and assistance platforms. Analyzing a person’s movements, knowing which technique is correct, and understanding how the joints are meant to rotate help provide feedback for everyday fitness buffs. Bodybuilding, for example, is easy to mess up. Incorrect techniques will utilize the wrong muscles and will often lead to injuries.
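One simple signal a form-checking app could derive from keypoints is the angle at a joint, computed from three neighboring points. The sketch below assumes 2D pixel coordinates and hypothetical hip, knee, and ankle positions; it is not taken from any specific fitness product.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must not coincide")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical keypoints for a deep squat seen from the side
hip, knee, ankle = (100.0, 100.0), (100.0, 200.0), (200.0, 200.0)
knee_angle = joint_angle(hip, knee, ankle)
print(round(knee_angle))  # 90
```

Tracking this angle frame by frame gives a curve that can be compared against a reference movement, for example to flag a squat that never reaches the intended depth.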

With the pandemic, we saw a huge upswing in home exercise, from prerecorded videos on YouTube to more complex systems like Tonal and Mirror. The latter works by applying computer vision technology to 17 major keypoints on the human body. Here, a virtual instructor can understand your progress, detect fatigue, and even act as a spotter during weight training sessions. Having virtual assistance as part of your everyday workout reduces the likelihood of injury, helps develop a personalized routine, and provides the convenience of a personal trainer without needing to leave the house.

Gait Analysis

Keypoint annotation and movement analysis are bigger than sports. In the medical community, movement analysis through keypoint annotation models can reveal a lot about a patient’s health, especially by looking at their gait. Scientists are using what they know about gait measurements and analysis to derive information about various diseases and injuries, and they’re using machine learning to expand their knowledge and prescribe treatment.

By capturing video footage of patients with conditions such as stroke, Parkinson’s, and cerebral palsy, medical professionals are able to annotate major joint regions frame by frame. By applying transfer learning to their data and using pre-trained deep learning models, scientists are able to measure various parameters, such as walking speed, cadence, swing and stance time, and double support time. Using their findings, they can then compare these gait measurements with those of healthy individuals in pre-trained models. Because healthy people tend to have similar gaits, it’s easy to get an overall idea of what these measurements should look like.
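To show how timing parameters like these relate to one another, here is a minimal sketch for a single foot. It assumes that heel-strike and toe-off times have already been detected from the annotated keypoints; the function name, inputs, and sample values are hypothetical.

```python
def gait_parameters(heel_strikes, toe_offs):
    """Timing parameters for one foot, from event times in seconds.

    heel_strikes: ascending times at which the heel touches the ground
    toe_offs: the toe-off time that follows each heel strike
    """
    if len(heel_strikes) < 2:
        raise ValueError("need at least two heel strikes to measure a stride")

    def mean(values):
        return sum(values) / len(values)

    # Stride: one heel strike to the next heel strike of the same foot
    strides = [b - a for a, b in zip(heel_strikes, heel_strikes[1:])]
    # Stance: foot on the ground, from heel strike to toe-off
    stances = [off - hs for hs, off in zip(heel_strikes, toe_offs)]
    # Swing: foot in the air, from toe-off to the next heel strike
    swings = [hs - off for off, hs in zip(toe_offs, heel_strikes[1:])]

    stride_time = mean(strides)
    return {
        "stride_time_s": stride_time,
        "stance_time_s": mean(stances),
        "swing_time_s": mean(swings),
        # One stride contains two steps (left and right)
        "cadence_steps_per_min": 2 * 60.0 / stride_time,
    }


# Hypothetical events: a 1-second stride with the foot grounded for 0.6 s
params = gait_parameters([0.0, 1.0, 2.0], [0.6, 1.6, 2.6])
print(params)
```

Walking speed and double support time need more than this: the first requires a distance scale for the footage, and the second requires events from both feet.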

Gait analysis and measurement provide medical professionals with an inside look at a person’s movement patterns and can help determine the source of injury or pain. They can also identify skeletal misalignments and track the progress of certain degenerative diseases. Understanding a person’s gait can help draft treatment plans and prescribe certain tests.

Hiccups and Hurdles

Patients with movement issues often need outside help during gait analysis, whether from assistive devices such as braces and walkers or from doctors and nurses. These extra factors can skew the dataset, so it’s important to ensure that your model can properly calculate the results and exclude outside influences. In addition, video footage taken with less sophisticated, handheld cameras must include several angles in order to yield useful and accurate data.

Machine learning applications for gait analysis are still a budding practice. While open-source datasets are available, using keypoint annotation to explore the possibilities is still new. It takes a lot of time and thousands of data points to put this into widespread use. What’s more, it’s expensive to implement and not every medical facility or treatment center has the resources to build robust machine learning practices.


Keypoint annotation has many practical use cases for everyday applications such as facial recognition and security. Being able to recognize a person of interest and prevent further crimes from happening can save lives and prevent catastrophic events. On the other hand, knowing that your face is kept in a database alongside thousands of others leaves some feeling uneasy. And it’s understandable why: without proper security protocols, that data can get leaked and fall into the wrong hands. It’s paramount that security is at the forefront of these operations so that the benefits of artificial intelligence and facial recognition outweigh the risks.

The ability to recognize not only faces but also movements has opened new ground in sports medicine. Some athletic organizations are using this technology to analyze movements, adjust training techniques, and prevent injuries. In the medical world, gait analysis and movement studies are paving the way for new types of diagnostic tools and treatment plans. For physical therapists, approaching their patients’ ailments through science and artificial intelligence can lead to faster recoveries.

Keypoint annotation has the potential to open many doors in law enforcement, security, and movement analysis. Using tools like the Superb AI Suite, ML professionals can get started on some of these groundbreaking projects. Schedule a call with our sales team today.