Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like Github for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models & code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools and more. As a pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally. Prukalpa previously co-founded SocialCops, a world-leading data-for-good company recognized as a New York Times Global Visionary and a World Economic Forum Tech Pioneer. SocialCops is behind landmark data projects, including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations. Prukalpa was awarded the Economic Times Emerging Entrepreneur for the Year and recognized in multiple lists such as Forbes 30u30, Fortune 40u40, and Top 10 CNBC Young Business Women 2016.
My conversation with Prukalpa was recorded back in April 2021. Since the podcast was recorded, a lot has happened at Atlan!
Prukalpa has written more content. I’d recommend checking out:
Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts.
If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
Here are highlights from my conversation with Prukalpa:
I grew up in India and didn’t have a hometown because my dad had a transferable job. So we lived in a bunch of different cities growing up. After high school, I got a scholarship to study in Singapore — where I studied engineering and entrepreneurship. Starting in university, I followed a very traditional path and ended up interning at Goldman Sachs. I got paid very well, but my ambitious colleagues did not love their work.
By chance, I got involved in the startup ecosystem in Singapore and built something called a Singapore Entrepreneurship Challenge — which helped me meet and interact with about 200 entrepreneurs. I enjoyed building things from the ground up. Around this time, I met and collaborated with a classmate named Varun on a couple of side projects during hackathons — some of them later became SocialCops.
SocialCops started as a random idea during a midnight brainstorm session between Varun and me. We decided that we were going to try a startup in our final year of university. The startup was never supposed to be SocialCops; it was supposed to be something else. The initial idea of SocialCops was to use crowdsourced data to drive better decision-making in the social and civic space. For instance, how could we get crowdsourced information about broken streetlights and use that to help drive city budgets?
It’s important to think about the magnitude of this because we were scholarship students in Singapore. Moving back to India was never an option. 7 or 8 years ago, the Indian startup ecosystem was not as robust as today. So why not do a startup? We were these kids who were passionate about this thing. Then reality struck. We didn’t have any money at this point.
We launched a crowdfunding campaign. This was when Kickstarter and similar platforms started getting popular. Overnight, we created a welcome video, put it live on Kickstarter, and fell asleep at 4 AM. We woke up the following day, and it had gone wild. Random Facebook posts connected us to different people. We ended up not raising a ton of money ($600), but that experience enabled us to do customer discovery in a unique way.
Now we still had to figure out how to fund ourselves. Let’s use our student card! There were business plan competitions everywhere. So we said: “Let’s take part in all of them.” We basically created a Google Sheet, listed every single competition worldwide, and sent out applications to all of them. We ended up raising about $25,000–30,000 in prize money, which was enough seed capital for us to move back to India and spend one year figuring out SocialCops’ business model.
We were the extended data teams for our customers. Essentially, we tried to use data to solve the world’s biggest problems (national healthcare, poverty alleviation, education, etc.). We quickly realized that the best way was to partner with the organizations with the most massive impact in the space. Whose decisions can we drive? If we drive their decisions, what kind of impact can that have? So we started working with customers like the United Nations, the Gates Foundation, and several large governments. They didn’t have any data teams, so we started acting as their internal data teams, responsible for the end-to-end implementation of their data projects. That’s where I learned about building and running data teams, and how complex and chaotic they can be.
The biggest challenge with data-for-good (or for most data practitioners) lies in the data. Outside of big tech companies, most data practitioners struggle to get their data together in a way that lets them start driving insights. This becomes even harder in a data-for-good scenario because no single factor is responsible for the outcome. For instance, let’s say a kid dropped out of school. Maybe the kid did not drop out because of the education. Perhaps they needed to work in the agricultural field for money, or maybe there were no girls’ toilets (which goes into sanitation).
At SocialCops, we drove some of the data-for-good initiatives. For example, with the United Nations, there was an initiative to bring together data about Sustainable Development Goals from 100 different places — such that we can answer questions like how do your education goals link to your economic empowerment goals? The question is how to do this at the world’s scale and the pace that the world needs.
The data team is the most complex team that exists in the org fabric. If you want a data project to be successful, you need an analyst, an engineer, a business consultant, a machine learning researcher, etc. All these diverse roles need to come together and collaborate effectively to make the project successful. Each of them has their own tooling preferences and skillsets. On a day-to-day basis, that meant chaos as soon as we hit some scale.
There was one quarter in which we went from analyzing 2 million people to suddenly 500 million people. Things broke left, right, and center. It took 8 hours and 4 people to figure out why a number on the dashboard went wrong. That’s a day in the life of a data practitioner today, right? We got to a breaking point where we realized we couldn’t continue like that. So, we started building an internal tool for ourselves.
Our fundamental thesis was that these problems weren’t technology problems as much as human collaboration problems. That’s the lens that we took. Over a couple of years, this tool made our data team 6 times more agile. We went on to build things like the India National Data Platform, which the Prime Minister uses himself.
What’s cool about the project is that it was built by an 8-member team in 12 months from start to finish, probably one of the fastest of its kind. Out of the 8 members, 4 had never pushed a line of code to production. That’s when we realized: if we make this tool available to other data teams worldwide and enable them to be twice as fast, what would that do to the world? That’s how Atlan was born.
The success of data teams comes down to creating a thriving culture in the team.
Data Catalog 3.0s will not look and feel like their predecessors in the previous generations. Instead, Data Catalog 3.0s will be built on the premise of embedded collaboration that is key in today’s modern workplace, borrowing principles from Github, Figma, Slack, Notion, Superhuman, and other modern tools that are commonplace today.
There are three different steps to deal with data quality:
That’s the broad gamut of data quality. As with everything else, there’s depth in each of these steps. Take detection, for example: Detecting missing values is pretty basic. Using ML algorithms for anomaly detection is more advanced. To answer your question about strong practices to ensure data quality, 80% of the problem can be solved with 20% of the work. I know that there’s a ton of exciting work happening in anomaly detection (and other techniques), but that’s not where 80% of the problem is for most businesses.
I’m excited about ecosystems like great_expectations, which make it easy for you to write unit tests as part of your pipeline. At Atlan, we profile data quality and allow users to set up basic alerts and monitoring on top of that. Allowing a business user to write a business rule is not rocket science, and you don’t need deep learning to do that. You just need to apply that work to high-scale data. The critical thing is to start measuring your data and working towards improvement.
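As a rough illustration of the "basic" end of detection described above, the sketch below uses plain Python to measure missing values and apply simple business rules to a handful of records. It is not the great_expectations or Atlan API: the `Rule` class, the helper names, and the sample rows are all hypothetical and exist only to show the idea of measuring data before reaching for anything more advanced.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]  # one record, keyed by column name (hypothetical shape)


@dataclass
class Rule:
    """A hypothetical business rule: a named check on one column."""
    name: str
    column: str
    check: Callable[[Any], bool]


def is_missing(value: Any) -> bool:
    # Treat None and empty strings as missing; 0 is a real value.
    return value is None or value == ""


def missing_rate(rows: List[Row], column: str) -> float:
    """Fraction of rows where `column` is missing (0.0 for no rows)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if is_missing(row.get(column)))
    return missing / len(rows)


def run_rules(rows: List[Row], rules: List[Rule]) -> Dict[str, int]:
    """Count failing rows per rule, skipping rows where the value is missing."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            value = row.get(rule.column)
            if is_missing(value):
                continue  # missing values are reported by missing_rate instead
            if not rule.check(value):
                failures[rule.name] += 1
    return failures


# Illustrative sample data, not taken from any real system.
rows = [
    {"order_id": 1, "amount": 120.0, "country": "IN"},
    {"order_id": 2, "amount": None, "country": "SG"},
    {"order_id": 3, "amount": -5.0, "country": ""},
    {"order_id": 4, "amount": 40.0, "country": "US"},
]

rules = [
    Rule("amount_non_negative", "amount", lambda v: v >= 0),
    Rule("country_is_iso2", "country", lambda v: len(v) == 2 and v.isupper()),
]

if __name__ == "__main__":
    for column in ("amount", "country"):
        print(column, "missing rate:", missing_rate(rows, column))
    print("rule failures:", run_rules(rows, rules))
```

Keeping missing-value measurement separate from rule checks means each number answers one question, which makes it easier to alert on either one independently.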
Data governance is specific to the organization. There are two kinds of organizations in the world:
I think data governance itself is being redefined. When we think about the concept of governance, it sounds risky and makes you less urgent, right? But governance simply means having processes and a foundation in place that help your team go faster. That’s what governance should become. We are in a new age of governance in which the start of governance is not compliance. The start of governance is making things more agile and helping companies become more data-driven. That’s what data governance 2.0 will look like. How can governance be an offense strategy rather than a defense strategy? That’s the type of question our industry still struggles with. The good thing is that no one has figured it out yet. If you’re thinking about this as a problem, that’s great. As an industry and a broader community, we all need to share investigative practices and work together to create standards that help the next set of organizations make sense of their data.
In general, every building block of a modern data stack is still under-invested.
There’s more maturity in the data ingestion, cloud data warehouses, and BI space:
The data ecosystem will see the next stage of innovation: In some spaces, the first winner has been created (Looker for BI or Snowflake for Cloud Data Warehouse). But the overall ecosystem is still early and will be here for the next 30–40 years. Because of that, we are seeing the second set of vectors: BI notebooks to disrupt BI, new data exploration tools, next-gen data warehouses, etc. So there will be a second wave of tooling. For the under-invested layers, we are starting to see the first wave of innovation.
Fundraising is neither the goal nor the milestone. In some cases, it is a necessary evil that you need to go through to build your company. Some companies don’t need to be built as venture-backed startups. If you can build a successful and profitable business, there’s a lot of respect and pride in that. Our industry talks way too much about how much money a company has raised and way too little about real milestones like customer love, customer retention, and tangible impact. As an entrepreneur, you should build a company to solve a problem, not to raise money. It’s hard to wear that hat because our society tends to hype up that kind of thing.
You are the one building something amazing in the world, burning the midnight oil, and working 80–90 hours a week, weekends included. Every person who works with you should feel lucky to work with you (even though it might not feel that way all the time). Therefore, pick who you work with very wisely. These people will work with you for a big chunk of your life, so pick those you trust and value. This means different things to different people. Founders need to understand what matters to them and reference check extensively. Speak to as many references as you can to understand how an investor will behave during bad times. You want to surround yourself with investors who support you in both good times and bad times.
Atlan’s mission is to help the humans of data become more productive. Teams need the ability to learn from each other. Thus, we have the Humans of Data blog to share our learnings about data teams' structure, DNA, and culture. Open-sourcing these ideas fits as one part of our broader mission. Many of these projects have been picked up by team members and pushed out to the broad data community. Community engagement is not our strategy. It just evolves from what we enjoy doing.
We realized that an equal part of building an amazing experience for our users is building an amazing team. I researched what it will take to build a great team and found that many of the companies that have endured were not the big tech companies. As I read more books about this topic, I stumbled on a fascinating book about McKinsey.
McKinsey is just a fascinating firm.
I spent time thinking about whether we could create that kind of loop for a company. The best talent has an amazing journey and grows exponentially when they work with you. When they leave, they become alumni and still contribute to the overall ecosystem and success of the company. This means thinking about building talent similar to the way of building SaaS products. HubSpot talks about the concept of the customer flywheel, and everything centers around one unit of a customer. Why can’t we think about an employee the same way?
That laid the foundation for how Atlan thinks about talent. Honestly, we haven’t fully implemented this, given the constraints of a startup. But I constantly think about how to attract the best to work with us, to grow them when they are with the company, to enable them to refer people to the company, etc. The more we can do that, the better our chances of building a moat through people.
I am thankful for all these recognitions. The critical thing to recognize is that I did none of this myself. As a society, we tend to glorify entrepreneurs a little bit more than we need to. All of this came down to the ability to build an amazing team capable of doing amazing things.
Recognition to me is neither a milestone nor a goal. If you do good work, it happens to you. The one thing that I’ve begun to realize is that such recognition can inspire the next generation of companies and founders by showing that you can be a 21-year-old and make an impact on the world.