In this article we’ll focus on one of those steps specifically: how to work with third-party data providers to supplement your internal data.
To start, though, we’ll briefly discuss how to identify what data you need and how to get it, whether by obtaining it from third-party providers or by developing your own internal resources. The conversations you have with data providers need to be framed in terms of what you’re paying versus what you’re getting.
Identifying the data you need
The first and most important step in obtaining data is identifying what you need in the first place. This will serve two purposes:
Achieve the desired outcome
The key is a deep analysis that works backwards from the capabilities you want, or the problems you need to solve, to the data required to achieve the desired outcome. You’ll not only have a better idea of what data is still needed, but you’ll likely develop a much better understanding of the process and workflow you’ll need to implement with the additional data.
You’ll often find that you have some pieces of the data you need for the objective, but are missing other pieces. Becoming clear on which of your existing data is of high enough quality for use in more advanced analytics, and what additional data you need, will guide where you look to obtain it.
Avoid the shiny object syndrome
We’ve all heard the stories of people going to Target to buy more toothpaste and coming out having spent $250. By working backwards from use case to data needed, you can avoid the “shiny object” syndrome when data providers present you with offerings you don’t actually need.
When firms spend money and time on data that they can’t or shouldn’t immediately use, frustration and discouragement often creep into the technology development process, and this can kill any positive movement towards better analytics.
Working with data providers
Before we start, a quick caveat: this is not an article on data science or artificial intelligence, so we’ve made an attempt to provide enough information without going into topics that those without a technology background may not be familiar with. Also, there are likely hundreds of questions that need to be asked of data providers, but those below should be a good place to start the process.
Geography covered
Some data providers offer data only on select geographical areas, such as major metropolitan areas, and do not offer data on smaller areas. If the coverage of the data provider matches the geography of your interest, then it could be a good fit.
If the data only covers a portion of the area of your interest, you need to think very carefully about how much benefit you’ll get from the partial coverage and whether or not partial coverage will present any challenges to smooth operations and analytics across your entire business.
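To make the coverage question concrete, here is a minimal sketch of how someone on your team might compare the markets you care about with the markets a provider covers. The function name and the market lists are hypothetical and for illustration only; a real comparison would use whatever geographic identifiers you and the provider share.

```python
def coverage_report(markets_of_interest, provider_markets):
    """Return the share of target markets a provider covers, plus the gaps."""
    wanted = set(markets_of_interest)
    covered = wanted & set(provider_markets)
    missing = sorted(wanted - covered)
    share = len(covered) / len(wanted) if wanted else 0.0
    return share, missing


# Hypothetical market lists, for illustration only.
my_markets = ["Austin", "Boise", "Denver", "Tulsa"]
provider_markets = ["Austin", "Chicago", "Denver", "New York"]

share, missing = coverage_report(my_markets, provider_markets)
print(f"Covered: {share:.0%}; not covered: {missing}")
```

A low share, or gaps in the markets that matter most to you, is a signal to ask the provider about expansion plans or to weigh the operational cost of partial coverage.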
Cleanliness of data
Real estate data is notoriously dirty (meaning values are missing, data is in the wrong columns, and there are formatting errors, misspellings, etc.) and difficult to work with.
If the data coming from the provider is still dirty and requires significant effort on the part of your firm to clean it and make it available for analysis, then you’ll be spending a lot more time and money than you probably planned to.
In many cases some cleanup on your end may be unavoidable, as it’s difficult to find and correct many of the idiosyncrasies of real estate data without knowledge of a specific property. If, however, the data is dirty because the provider is not taking additional steps to clean it, you may want to explore other providers with data that is further along in the preparation process.
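One simple way to gauge cleanliness is to ask the provider for a sample file and measure how often each column is blank. The sketch below assumes the sample arrives as a CSV; the column names, the sample rows, and the list of placeholder values treated as missing are all assumptions for illustration, not any provider’s actual format.

```python
import csv
import io

# Values treated as "missing"; an assumed list, adjust to the sample you receive.
PLACEHOLDERS = {"", "n/a", "na", "null", "none"}


def missing_rate_by_column(rows):
    """Return the share of blank or placeholder values in each column."""
    counts = {}
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            is_missing = value is None or value.strip().lower() in PLACEHOLDERS
            counts[column] = counts.get(column, 0) + int(is_missing)
    if total == 0:
        return {}
    return {column: n / total for column, n in counts.items()}


# Hypothetical sample data standing in for a provider's file.
sample = io.StringIO(
    "address,sq_ft,year_built\n"
    "12 Main St,1800,1995\n"
    "98 Oak Ave,,N/A\n"
    "5 Elm Rd,2400,\n"
    "77 Pine Ln,n/a,2003\n"
)

rates = missing_rate_by_column(csv.DictReader(sample))
for column, rate in rates.items():
    print(f"{column}: {rate:.0%} missing")
```

This only catches missing values. Misspellings and values in the wrong columns need checks specific to each field, but even a rough missing-value profile gives you something concrete to discuss with the provider.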
Availability of data
There are many different ways that data gets from one place to another. Some providers require that you access information by individually extracting each set of information you want.
Others provide files that must be manually downloaded, with additional steps to follow to integrate the data into your system. Other providers offer highly structured APIs (Application Programming Interfaces) that allow seamless communication between one system and another. Depending on your use case and internal technical capabilities, you’ll need to decide which of these options best fits your needs.
Where they get their data
Much of the data available on real estate is from public sources. You want to be sure you’re not paying a data provider for data that you could easily collect and integrate on your own. As discussed above, however, many times this data is very dirty and the data provider spends a significant amount of time fixing this data.
Cleaning the data often requires data scientists with specific skills in data collection and cleaning, so it may be worth it to go through a provider even for publicly available data. Other times it may be more cost-effective to explore collecting and integrating the data on your own through either internal hires or consultants with expertise in data collection. This will require a cost analysis of each option.
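A first pass at that cost analysis can be as simple as comparing the total cost of each option over the period you expect to use the data. The sketch below is only that, a first pass; every dollar figure and the three-year horizon are hypothetical placeholders that you would replace with real quotes and internal estimates.

```python
def total_cost(upfront, annual, years):
    """Total cost of an option: one-time setup plus recurring annual cost."""
    return upfront + annual * years


# All figures are hypothetical placeholders, not real prices.
YEARS = 3
options = {
    "provider subscription": total_cost(upfront=5_000, annual=40_000, years=YEARS),
    "in-house collection": total_cost(upfront=60_000, annual=25_000, years=YEARS),
}

cheapest = min(options, key=options.get)
for name, cost in options.items():
    print(f"{name}: ${cost:,} over {YEARS} years")
print(f"Lowest total cost: {cheapest}")
```

The comparison can flip as the horizon grows, since in-house work tends to carry a higher upfront cost and a lower recurring one, so it is worth running the numbers for more than one time frame.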
Format of the data
When it’s clear that data will be combined with other data, either your own internal data or external data from multiple third-party providers, you need to make sure that the format of all the data is consistent.
By this we mean that if your internal data is organized by census tract and the data provider’s is organized by zip code, you may have difficulty merging the two data sets effectively and accurately. Another formatting issue is how often the data is collected and aggregated.
If all of your data is aggregated monthly, but a data provider only provides quarterly aggregated data, it might not suit your purposes. Ensuring consistency of geography, periodicity (time interval), and other formats is key to a smooth transition from external provider to internal use.
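When periodicities differ, the usual workaround is to roll the finer-grained data up to the coarser interval. Here is a minimal sketch, assuming your monthly figures are additive totals (such as transaction counts) keyed by year and month; the function name and sample values are hypothetical.

```python
from collections import defaultdict


def monthly_to_quarterly(monthly_totals):
    """Sum monthly totals keyed by (year, month) into (year, quarter) totals."""
    quarterly = defaultdict(float)
    for (year, month), value in monthly_totals.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, etc.
        quarterly[(year, quarter)] += value
    return dict(quarterly)


# Hypothetical monthly transaction counts, for illustration only.
monthly_sales = {(2023, 1): 10, (2023, 2): 12, (2023, 3): 8, (2023, 4): 15}

quarterly_sales = monthly_to_quarterly(monthly_sales)
print(quarterly_sales)
```

Note that this only works in one direction and only for values that can be summed. Quarterly figures cannot be split back into accurate monthly ones, and measures such as medians or averages cannot simply be added together, which is why a mismatch in periodicity is worth raising with a provider early.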
Price
After reviewing all the factors above, you finally need to decide whether the price of the data from the provider is worth what you’re getting. Paying for dirty, incomplete data for the sake of having data hurts your firm more than it will help.
Paying too much for data that provides only marginal benefit will also hamper your technology development efforts and eventually will become a risk to the entire digital transformation process if it leads to discouragement in the firm.
Understand what you’re getting and what it’s worth to your firm. Then decide if that benefit is worth the cost. Note: this may involve bringing in outside expertise to evaluate some of the more technical aspects of a data provider transaction if your firm does not have the expertise internally.
Data providers are unlikely to understand your business or goals as well as you do, so they are unlikely to be able to vet what data is right for you. Only by understanding the nuances of your needs and the data available will you be able to effectively choose the right provider.
What to expect from data providers?
Throughout the vetting process, providers should be willing and able to answer your questions directly, show you examples and formats of data, and be as transparent as possible.
This doesn’t mean exposing trade secrets, but they should be extremely open about how their data and processes will help your company. They should also be transparent about what challenges you may experience with their data or what shortfalls their data has relative to your needs.
If questions are not openly addressed or the process becomes difficult, treat it as a red flag for how both the relationship and the usefulness of the data could evolve.
Working with as few data providers as possible
One final note on choosing data providers. It is usually recommended that you partner with as few data providers as possible. All of the factors above (geographic concerns, availability, cleanliness, format, etc.) will have to be dealt with for every single data provider you use.
Eventually, firms begin having difficulty integrating each data provider’s data with their own data and also with every other provider’s data. This can quickly become a mess and stall any forward movement you could have made towards implementing more advanced analysis in your firm.
Conclusion
Firms need to either adapt or face the risks associated with stagnation, but adapting ineffectively is just as dangerous as not adapting at all. Answering questions on strategy and backing into what you need to accomplish that strategy is the first step in the process.
Choosing appropriate partners along the way can make or break the potential of your firm in the future. By thinking more deeply about the topics above, hopefully you’ll be able to make decisions that better position your company for success.
Author
Josh Panknin
Director of Real Estate Artificial Intelligence and Innovation