Ep 47: Ramp's $8 billion data strategy (w/ Ian Macomber and Ryan Delgado)
Elevating data as an equal to product engineering and design
Ian Macomber, head of analytics engineering and data science at Ramp and formerly the VP of Analytics and Data Engineering at Drizly, and Ryan Delgado, a staff software engineer at Ramp, have played pivotal roles in establishing Ramp's data team from the ground up and are spearheading its data roadmap.
In this conversation with Tristan and Julia, Ian and Ryan share how Ramp's data team turned unstructured contract data into insights that enable faster decision-making. The $8 billion company values speed and empowers teams to build, ship, and measure products quickly. Ian and Ryan also talk about their approach to adopting new tech and elevating data as an equal player alongside product engineering and design.
Listen & subscribe from:
Below are the key takeaways from this conversation.
What is Ramp Intelligence? Why did you take it on? How did it go from idea to customers?
Ian Macomber: Ramp Intelligence is a broad suite of LLM-powered solutions embedded throughout our platform, including Vendor Price Intelligence, an accounting co-pilot, a contract digestion and negotiation solution, and a lot of automated accounting processes. I'll talk a little bit about price intelligence and accounting, which are the two products I've been closest to.
Generally, Ramp has a ton of contracts and bills from vendors. If you think about four years ago, it would have required a lot of human operator time to go through these contracts on a line-by-line basis to see what's included. They're not standardized. They're not JSON blobs. There's SKUs, there's pricing, there's seats, there's contract length. You would have had to read hundreds of contracts to see what's reasonable, and that's exactly what we did to get started and really understand this problem.
We prompt LLMs, including GPT-4, to parse these contracts from unstructured data into a very structured, standardized form. That allows us to aggregate and determine how much is paid for specific product plans and understand the shape of the distributions. Combined with millions of historical credit card and bill pay transactions, this allows customers to benchmark the price of their software purchases.
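As a rough illustration of that extraction step, the sketch below prompts a chat model to emit a fixed JSON schema and parses the result. It assumes the OpenAI Python client; the model choice, prompt, and field names are illustrative assumptions, not Ramp's actual pipeline.

```python
# Minimal sketch of contract extraction with an LLM. The schema, prompt, and
# model are illustrative assumptions, not Ramp's production pipeline.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = """You are a contract parser. From the contract text below, return a
JSON object with these keys: vendor, product_plan, skus, seats, annual_price_usd,
contract_length_months. Use null for any field that is not present.

Contract text:
{contract_text}
"""

def parse_contract(contract_text: str) -> dict:
    """Turn one unstructured contract into a standardized, structured record."""
    response = client.chat.completions.create(
        model="gpt-4o",  # any JSON-capable chat model
        messages=[
            {"role": "user", "content": PROMPT_TEMPLATE.format(contract_text=contract_text)}
        ],
        response_format={"type": "json_object"},  # ask for strict JSON output
    )
    return json.loads(response.choices[0].message.content)

# With thousands of contracts parsed into the same schema, the records can be
# aggregated to benchmark what peers pay for a given product plan.
```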
Tell me a little bit about where you started and what motivated your change in thinking.
Ian Macomber: This is probably one of the biggest 180s I've done in my time at Ramp. When I joined, I was really of the mindset that our SQL and Python code base is our baby. Our standards are high. I thought that our code quality would degrade if we allowed others to join and contribute. Since then, the biggest thing I've learned is that business logic always finds a way to be written. We have really smart and technical and resourceful and fast-moving stakeholders, and what we found is that core reporting ended up in other systems, and oftentimes, it was unexpected. We have a lot of business logic at the edge.
Everyone probably has some story of an Excel sheet that powered most of the business decisions, a super complex Looker custom dimension that should have been refactored into dbt, or someone's personal Jupyter notebook that they run every once in a while to update the finance team. I can tell you exactly when you find out about things like this. The only way forward is to welcome others in.
We changed a lot of our philosophy. We wrote our documentation, we changed our standards, and we invited others in to learn how our team works, how to contribute to our code base, where things should go, and why.
There's definitely been hiccups along the way, but I think overall it's led to healthier code. People are excited to write dbt, especially people that are not necessarily on our team. We have a lot more centralized business logic as well as visibility into usage. As a result, we're moving a lot faster.
The big thing I learned is to always ask your stakeholders what they actually look at to make decisions. You might think and hope it's the gorgeous Looker dashboard built on a beautifully normalized, Kimball-modeled dbt project, but oftentimes it's not. You'll often be surprised. It's on us as data leaders to look at what people are actually using to make decisions.
Ramp's been really good at adopting new tech quickly. How has that been indoctrinated into your team's philosophy, and how has it gone for you?
Ian Macomber: We really like to take bets with asymmetric upside and really invest in slope and spikiness. Whether it's hiring, data science research, building products, or vendor selection, we try to ask, "What does this person do better than anything? What does this tool do better than anything? Can I imagine in my gut that a year from now, they're going to be even better at it?" A phrase that I love is "code is not an asset; code is a liability." If a teammate's thinking about something, I say, "Don't just tell me why you're going to build it. Tell me why you're going to commit to maintaining it in perpetuity." Think about moving data from Google Ads to Snowflake. Ramp needs to do that. Everyone needs to do that. For us, that's not strategic. We can take a dependency on Fivetran, and I'm confident Fivetran is going to be better at that next year than this year.
When we think about the vendors that we like to bet on, what we're saying is, "I bet the value that this person is going to be able to deliver next month is going to grow faster than we could possibly build in-house or buy from a mega corporation."
I also think, "What's the worst that could happen?" Because oftentimes, whether it's individuals or products or vendors, they don't all have to work. What you really need to find are those massive upside bets for the business. Certainly we have a lot of partners that we've worked with where we have grown on their backs.
Ryan can talk a little bit more about that as well.
Ryan Delgado: We work with a lot of SaaS vendors as part of our operations and data platform. There are niche problems where there isn't really a Snowflake-scale vendor to solve them, things like handling data privacy while still being able to move quickly. With something like that, we need to decide between building it ourselves or trying out a vendor or managed service. The calculus for me looks like buy versus build: I can hire a full-time engineer and have them build something for three to six months, and then maintain it for the rest of their time here.
Or I could pay for a managed service for less than a full-time engineer's salary and get it immediately. The risk is that when buying from young startups, the product may have stability issues, or maybe it doesn't do everything we want it to do right away. It's an issue we've encountered many times and need to mitigate. The key there is to provide regular feedback and drive the product roadmap. I'll be very specific about what we want and hold them accountable for delivering on those objectives. We also need to continuously evaluate the vendor relationship and ask questions like, "Are things improving over time? Is the software becoming more stable? Are they featurizing the product the right way? Are we getting more value out of it than we're paying over the long term?" If we realize we're getting more pain than value, we need to end the relationship and cut our losses.
How are you organizing your dbt code base to enable these different humans to have the decentralization that they need, but also to not get lost in their own islands?
Ian Macomber: Maybe I can take a stab at that. Something I've learned is that 24-hour calculations ultimately end up being used in production at low latency.
Tristan Handy: I want to put that on my wall. It's like, you may not realize this is true at the outset, but it will become true sooner or later.
Ian Macomber: Two examples of this: when I got here, two relatively complex calculations we ran every day were the amount of money in a company's bank account and its delinquency status on Ramp. Those were calculated every 24 hours and used largely for reporting.
Knowing if a company has lost a lot of money in its bank account is really important. So, in some instances, I would say these things graduate up from dbt on Snowflake to dbt on Materialize and are still the purview of the data team.
In other situations, we say, “Look, this is no longer a reporting metric; this is the state of the business, and it's a production-grade input to all of our systems.” We actually want to adopt and rewrite this in our core transactional databases. In terms of organization, we have seen some stuff graduate not only out of Snowflake and dbt, but out of the hands of the data team entirely. We view this as a massive success.
Tristan Handy: This is something I honestly haven't spent a lot of time thinking about. Is there a world in which we want dbt to be able to support that type of production criticality? I think you're entirely right to put dbt largely in the bucket of supporting reporting workloads today. But it's an interesting question that I don't think is settled over the next 5-to-10-year time frame.
Ian Macomber: That's right. I think another question almost goes in the other direction. For example, if you build something in Snowflake, you're not going to power your production app on Snowflake, but you can say, “Hey, Snowflake, at the end of this calculation, let's make sure this table can support transactional workloads and not just analytical workloads.” You can imagine a world where a data team is a little bit more able to provide delinquency status to a business.
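To make that graduation path concrete, here is a hypothetical sketch of the bank-balance example maintained continuously in Materialize rather than recomputed every 24 hours in a warehouse batch run. Materialize is wire-compatible with Postgres, so it can be queried from Python with psycopg2; the connection string, table, and column names are invented for illustration and are not Ramp's schema.

```python
# Hypothetical sketch: a metric that used to be a daily batch calculation
# (current bank balance per business) kept continuously up to date as a
# materialized view in Materialize. Table and column names are invented.
import psycopg2

# Materialize is wire-compatible with Postgres, so a standard driver works.
# Connection details here are placeholders.
conn = psycopg2.connect("postgresql://materialize@localhost:6875/materialize")
conn.autocommit = True

with conn.cursor() as cur:
    # Incrementally maintained as new bank transactions arrive, instead of
    # being recomputed once every 24 hours by a warehouse batch job.
    cur.execute("""
        CREATE MATERIALIZED VIEW bank_balance AS
        SELECT business_id, SUM(amount) AS current_balance
        FROM bank_transactions
        GROUP BY business_id
    """)

def current_balance(business_id: str):
    """Low-latency read of a value that was previously a daily reporting metric."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT current_balance FROM bank_balance WHERE business_id = %s",
            (business_id,),
        )
        row = cur.fetchone()
        return row[0] if row else None
```

In practice the model would more likely be managed through the dbt-materialize adapter so the SQL stays in the same code base; this sketch just shows the shape of the change from a 24-hour refresh to a continuously maintained, production-facing view.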