2021 has been a rapidly growing year for Superb AI. The first big event of the year was our Series A, which allowed us to expand our Korea and US teams to accelerate our path to better building and delivering training data for computer vision applications.
Our top priorities throughout 2021 have been building a product that users love and building repeatable, predictable growth.
We wanted to find answers related to many things - the core pain points of our users, features that they love the most/least, ways of identifying actual product users and purchase decision makers, ways to approach both groups, product packaging, product prioritization, and so forth.
We also wanted to be strategic about how to tread the industry landscape — who should we identify as potential partners and collaborate with?
Building The Product
At Superb AI, we are obsessed with solving the pressing data labeling problem for the computer vision industry. In 2020, we launched an auto-labeling feature based on a pre-trained model that accurately detects up to 100+ general object classes. Then, we built and layered on top our Uncertainty Estimation AI to measure the uncertainty of each auto-labeled annotation to speed up active learning workflows.
As we invest more time speaking to more CV practitioners across the industry without bias for the domain, the company size, or the operational maturity, we are bullish that they need even more advanced automation and agile operations for their data labeling workflows in diverse application scenarios. This year:
1. In Q1, we doubled down on improving Custom Auto-Label, our powerful tool that can drastically reduce the time it takes to build and iterate on datasets. We also enabled integration with AWS S3 and Google Cloud Platform for data storage, as well as the ability for our users to download raw data.
2. In Q2, we developed keypoint auto-label, improved Suite’s UI/UX, and added SDK improvements and examples.
3. In Q3, we released the Manual QA set of features built to streamline the label validation workflow so you can consistently collect high-quality labels without significant efforts. We also provided performance metrics on Custom Auto-Label and added multi-polygon segmentation features.
4. In Q4, we reduced human-in-the-loop by enabling classification support, control over train/validation split for Custom Auto-Label, and refreshed auto-label UI. Additionally, we curated open datasets that can be used as ground truth for various computer vision tasks so practitioners can visually explore them for their use cases.
Our product roadmap for 2022 is to provide more advanced labeling capabilities (such as consensus labeling), more intuitive labeling UI, and more support for computer vision tasks and data formats. We will also refactor our codebase for enhanced search performance, improved bulk operations, and near-real-time analytics.
Although data labeling has been our bread and butter, we have started to form a thesis around building the DataOps platform for the modern computer vision stack in a data-centric AI development world. We have explored DataOps principles from the data analytics world, examined specific data challenges in enterprise computer vision, and considered the organizational structure needed to take advantage of the valuable data work. We will work on new DataOps capabilities such as image and video search, metadata analysis, data pre-processing, mislabel detection and debugging, workflow automation, data versioning and branching for data engineers, and more. Stay tuned for these updates!
Building Industry Partnerships
Earlier this year, we joined the AI Infrastructure Alliance to build a Canonical Stack for AI. The hope is to drive strong engineering standards while creating seamless technology and product integrations between the many layers that exist in the AI infrastructure ecosystem. One of the first big projects of the Alliance has been working on blueprints to help people get a better grip on a real-world enterprise AI/ML workflow.
As seen in the tech stack diagram above, Superb AI neatly covers the data engineering engine - which includes data ingestion, data transformation, and data labeling stages. From this perspective, we have been thinking about AIIA partners that cover other phases of this stack to collaborate and integrate with:
1. Our friend Valohai provides a powerful MLOps platform that automates the not-so-fun parts of ML, such as pipeline orchestration and model deployment. Take a look at the workflow combining our two platforms here.
2. Our friend Pachyderm provides the data foundation for ML with powerful data versioning and data pipelining capabilities. Take a look at our product integration to improve data-centric operations here.
3. Our friend WhyLabs provides the critical missing component of AI observability in production ML systems by monitoring ML deployments. Take a look at our shared workflow to enable reliable data operations here.
4. Our friend Arize AI provides a robust ML observability platform that enables ML teams to quickly detect where issues emerge and deeply troubleshoot the reasons behind them. Take a look at our shared mindsets on ensuring high-quality structured and unstructured data here.
Our goal for 2022 is to continue to deeper these partnerships with further product integration and joint marketing efforts. We are also exploring more partnerships with other startups in the data and AI ecosystems, such as model experimentation engines, data science notebooks, feature stores, and more.
We would like to take the opportunity to thank our amazing team for this journey thus far. Superb AI is not only about a more efficient way to manage training data for real-world ML systems. It’s also about a better way to collaborate with data and ML practitioners in a fast-moving culture of teamwork and accountability. The same spirit we feel within our community of users and partners who continuously assist us in making Superb AI the best ML data management platform. Thank you!