How to Keep Your Data Integration Project from Being a Wreck
Your reports are on the struggle bus. Your data is all over the place. Everyone is complaining that they can’t get the reports they need, or that they spend too much valuable time getting mediocre reports. You and your staff are spending way too much time battling data systems that should be making your life easier instead of harder. Are you fishing around for days just trying to get the right information? Answering 20 questions a day from people who can’t get the data they need? Spending hours (or days, or weeks) cleaning and merging the same data over and over in various spreadsheets?
There are a lot of ways to store and analyze data out there. And there are even more vendors out there who will sell you a system that promises to take care of everything you need. Spreadsheets, vendor systems, custom databases, data lakes, data warehouses. It can be challenging to figure out what you need. Should you put all your data in one place? Try to build a data warehouse? What even IS a data warehouse? You aren’t sure what the best solution is to your data dilemma, and you definitely don’t know how much it will cost.
YOU ARE READY TO TACKLE THIS PROBLEM
You know you need a new solution; the patched-together vendor systems and spreadsheets are not working anymore. You have even managed to convince the board (or the Executive Director) through lengthy conversations, PowerPoint presentations, and possibly some cajoling and flattery, that they should carve out a budget for your project.
BUT YOU AREN’T SURE WHERE TO START
You are anxious, and you are getting impatient. The saying “eat the elephant one bite at a time” occurs to you, but right now you feel more like the elephant is standing on your chest. You have to make the best investment for your organization.
You contemplate the right way to use your scarce resources (probably your only opportunity to spend this kind of money) to make sure you and your staff have what you need when it’s all said and done. Do you meet with a bunch of vendors and compare whatever they give you? Do you try to build something in-house? A hybrid approach? You’ve got to get this right, because after you spend it, that budget is gone forever.
And knowing that only half of IT projects get finished on time and on budget makes the stakes for this even higher.
Every vendor will tell you that theirs is the perfect plan, and that the price they quote will get you everything you need. But they often don’t have enough information to know that. Have you ever heard the adage that every technology project takes twice as long and costs twice as much as planned?
YOU ARE AFRAID OF GOING OVER BUDGET AND BLOWING UP YOUR TIMELINE
These are the stories we hear all the time:
- You set a budget, and it’s a one-time spend. You get bids from vendors, or from your IT department, and halfway through the project whoever is building it tells you it will cost more money. Or, you get to the end of your budget and the project isn’t finished yet.
- Someone promises you the sun, the moon, and the stars, and every feature you thought to ask for (off the top of your head) but when the build is done (or you run out of time and/or money) the system doesn’t do what you need it to. The data isn’t integrated, it’s not the right data, those aren’t the reports you need, or you can’t get the right users onto the system.
- You’ve gone out on a limb to make a big ask of your organization (or even your best funder) and now you are over budget, out of money, and don’t have the data system you promised everyone. This is embarrassing.
This happens a LOT, but it doesn’t have to be this way.
We get it. You need a better, more efficient (possibly automated) way to get the reports you need, and you can’t risk having the project budget and timeline blow up on you. Your boss (or the board) will never trust you again. You’ve got one shot at this.
IMAGINE IF YOU COULD…
- Know exactly what you need, and have a plan that both you and your potential vendors can understand.
- Determine the best approach for your organization - whether it’s using existing technology, purchasing a whole new system, or using a combination of tools.
- Get the system you need to solve your problem, on time and on budget.
- Look like a rockstar to your board and your colleagues.
You can. That’s why we created the Impact Blueprint. And we are going to show you how you can do it.
We’ve helped over 150 nonprofits solve data problems like yours. We are happy to share the steps that we go through to find the best solution to data and reporting problems, and to work out what the budget would be to address them. We love helping nonprofits get the best fit, without wasting time and money!
THE POWERFUL IMPACT BLUEPRINT PROCESS
There are three steps we use to create an Impact Blueprint, and make a data project 100% successful, so you can get the effortless and accurate reports you need (and no longer work on weekends). Now you will be able to use them, too.
1. Interview People Who Use the Data
Start by interviewing key people in your organization to understand who is using which data, for what purposes, and what systems are most important to them. Confirm what the high-priority data is, and where it lives. Listen for unusual sources or uses of data. Confirm what you think you know, and ask questions about what’s really important to you, your organization, your mission, and your bottom line. Listen for “shadow data” such as spreadsheets that someone is using to track information that’s important to their job, but isn’t captured in official data systems.
Listen for complaints about what’s not working, as well as bright spots about what works well. Also really dig in to learn about what staff have to do to get the data they need out of the system, to integrate it together properly, and to analyze it to get the answers they need. You will be astounded at the many data sources, steps, and amount of time people take to get reports. This work can be unknown to recipients of the reports, such as organizational leaders, and this is the time to uncover it.
- Focus on your reports. It’s easy to go down several rabbit holes at this stage. Remember that you want to understand how people use data and data systems for reporting. What do they have to do to make the data accessible and use it?
- Don’t leave out the person in the back office who complains all the time. Sometimes this person has a unique use case, or even a long history of watching data projects fail. You can learn a lot from them.
- Interview the people who work with the data every day, as well as the higher-ups who make decisions (or would like to) using the data. Don’t get caught in the HIPPO dilemma (giving too much sway to the HIghest Paid Person’s Opinion).
- Confirm what you hear by comparing notes across interviews.
2. Assess Your Data Systems
A data system is any place where data is stored. A data system could be built in something as complex as Salesforce or something as simple as a series of Google Sheets.
Begin assessing each system by asking whoever manages it to describe what they do. If a task does not sound familiar, or if the process sounds complex, ask them to show you some of the more common tasks. Some guidelines for accessing a data system are:
- First, check your system permissions for the ability to manage users. If you can manage users, it usually means you have full admin access, because managing users and permissions is typically one of the highest levels of access in any system.
- While you are exploring users, you can also take a look at groups, roles, permissions, and permission lists: these groups and roles might give clues to how the system is used and what the various user roles are as well as what users are allowed to do.
Inventory the Data
Inventory and Modeling both involve multiple rounds of identifying, documenting, and validating aspects of your data ecosystem.
Step 1: Gather all the data sources you’ve uncovered. This could involve looking at outputs, or logging directly into the systems. We often prefer to compare spreadsheets, either dumped from the system, or spreadsheets we’ve created for each system to document what data is contained in it.
Step 2: Data Inventory: For each data sample, identify the following things:
- Which columns contain important information? Important information is high value, and needed for integration and reporting. Don’t assume that because you have the information it’s valuable.
- Locate unique identifiers (primary keys)
- Note the completeness and accuracy of each data set. Identifying data quality issues is extremely important as it’s often a stumbling block to implementing a solution (of any kind).
- This is a good time to have a Data Analyst do a LIGHT quality control check. Look for things that might impact your recommendations when you get to the architecture design.
- Identify the common information in different data sources that would connect them to other data sets (the shared keys that link one data source to another, often called foreign keys). Is the information stored in the same way? Do unique identifiers need to be created in order to join information from two data sources?
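To make the inventory checklist concrete, here is a minimal sketch in Python. It assumes your data sample is a CSV export; the sample rows, the column names, and the helper functions `profile` and `key_overlap` are all hypothetical, invented for illustration. It reports how complete each column is, flags columns that could serve as a unique identifier, and estimates how well an identifier matches up with the same identifier in another source.

```python
import csv
import io

# Hypothetical export from one system; in practice this would be a file
# dumped from the source.
SAMPLE_CSV = """donor_id,email,zip,last_gift
D001,ann@example.org,21201,2023-01-15
D002,,21202,2023-02-01
D003,cy@example.org,,
D004,dee@example.org,21201,2023-03-09
"""


def profile(rows):
    """Per column: share of non-blank values, and whether it could be a key."""
    if not rows:
        return {}
    total = len(rows)
    report = {}
    for column in rows[0].keys():
        values = [row[column].strip() for row in rows]
        filled = [v for v in values if v]
        report[column] = {
            "completeness": len(filled) / total,
            # A candidate key has no blanks and no repeated values.
            "candidate_key": len(filled) == total and len(set(filled)) == total,
        }
    return report


def key_overlap(left_keys, right_keys):
    """Share of left keys also found on the right, after trimming and upper-casing."""
    left = {k.strip().upper() for k in left_keys if k.strip()}
    right = {k.strip().upper() for k in right_keys if k.strip()}
    return len(left & right) / len(left) if left else 0.0


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
inventory = profile(rows)
for column, stats in inventory.items():
    print(f"{column}: {stats['completeness']:.0%} complete, "
          f"candidate key: {stats['candidate_key']}")

# Identifiers as they appear in a second (hypothetical) source.
other_source_ids = ["d001", "D002 ", "D009"]
overlap = key_overlap([r["donor_id"] for r in rows], other_source_ids)
print(f"donor_id overlap with other source: {overlap:.0%}")
```

This is a LIGHT check, not a full data quality audit. A low overlap score is a hint that identifiers are stored differently in the two systems, or that a new shared identifier may need to be created.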
Specify Reporting Needs
- Ask the people you interview (as appropriate) what reports are important, and make sure to get copies of as many reports as you can. This will give you a sense of what the end result needs to be, data-wise, and you can dig in a bit to how those reports are constructed.
- Identify and document valuable key metrics and reporting outputs that make up those reports, make sure you know where they come from, and what data transformations have to take place to create them.
- Early in the process, decide if you need to consult with anyone who has expertise in a particular software or content area (this may involve a research and evaluation expert, marketing expert, Salesforce or Hubspot pro, or member engagement expert)
- Document feedback on system configurations relative to desired reporting and metrics when they come up.
Model the Data
A data model organizes data elements and standardizes how the data elements relate to one another. Since data elements document real-life people, places and things and the events between them, the data model represents reality. For example, a house has many windows or a cat has two eyes.
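As a small illustration of the "a house has many windows" example, here is one way that relationship could be written down in Python. The class and field names are assumptions made up for this sketch, not part of any particular system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Window:
    window_id: str   # unique identifier for the window
    width_cm: int


@dataclass
class House:
    house_id: str    # unique identifier for the house
    address: str
    # One house has many windows: a one-to-many relationship.
    windows: List[Window] = field(default_factory=list)


house = House(house_id="H1", address="12 Elm St")
house.windows.append(Window(window_id="W1", width_cm=90))
house.windows.append(Window(window_id="W2", width_cm=120))
print(f"{house.address} has {len(house.windows)} windows")
```

In a nonprofit setting, the same shape might describe a donor with many gifts, or a program with many participants.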
Data models are often used as an aid to communication between the people defining the requirements for a system and the people defining the design in response to those requirements. They are used to show the data needed and created by business processes.
In the case of the Impact Blueprint, the data model is used to describe the data that is most important, and will be necessary for the resolution of whatever data and reporting problem you have.
A data model explicitly determines the structure of data.
- First, now that you understand your reports, and what data would feed into those reports, map that out in a diagram.
- Second, look more specifically at each data system, and what data are stored in there, and in what format (using the data inventory that you did previously) to ensure that you have the data you need, and to understand which specific fields you need to build the reports.
- Third, describe how the current data sources communicate with each other, and how data can be exported (API, download, etc.).
- Now, take all this information and create a diagram that describes how all these fields in these data sources would need to come together to create the needed reports.
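Before drawing the diagram, it can help to write the mapping down in a form you can check. The sketch below is one possible way to do that in Python. The system names, field names, report names, and the `missing_fields` helper are all hypothetical. It lists the fields each report needs and flags any that the data inventory did not turn up.

```python
# Hypothetical inventory: the fields each system is known to hold.
system_fields = {
    "crm": {"donor_id", "email", "zip"},
    "gifts_sheet": {"donor_id", "gift_date", "amount"},
}

# Hypothetical report requirements: (system, field) pairs each report needs.
report_needs = {
    "annual_giving": [
        ("crm", "donor_id"),
        ("gifts_sheet", "amount"),
        ("gifts_sheet", "gift_date"),
    ],
    "retention": [
        ("crm", "donor_id"),
        ("gifts_sheet", "gift_date"),
        ("events", "attended"),
    ],
}


def missing_fields(system_fields, report_needs):
    """Return, per report, the needed fields that no inventoried system provides."""
    gaps = {}
    for report, needs in report_needs.items():
        missing = [
            (system, name)
            for system, name in needs
            if name not in system_fields.get(system, set())
        ]
        if missing:
            gaps[report] = missing
    return gaps


gaps = missing_fields(system_fields, report_needs)
for report, missing in gaps.items():
    print(f"{report} is missing: {missing}")
```

Any gap that shows up here is a question to take back to the people you interviewed: either the data lives somewhere you have not inventoried yet, or the report needs data that nobody is collecting.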
3. Create Your Plan
Develop the plan based on the priorities, resources, and timelines you identified earlier in the process. Your findings, both organizational and technical, should lead you to recommendations and the resolution of your reporting challenges.
First, figure out what format each data source is in. Would you access it through an API and connect live to the source? Or is it more feasible to download data regularly and use a spreadsheet?
- This will be influenced by whether the local data source has a robust API, and also whether you have the expertise to work with an API.
- This decision may also be influenced by how often you need to update your data. If you update your data once a month or less, it may not be worth dealing with the technical issues of an API, or fixing it when it breaks.
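The considerations above can be written as a simple rule of thumb. The function below is only a sketch of that reasoning: the name `choose_access_method`, its parameters, and the once-a-month cutoff applied as a hard rule are assumptions for illustration, and your situation may call for a different answer.

```python
def choose_access_method(has_robust_api: bool,
                         has_api_expertise: bool,
                         updates_per_month: float) -> str:
    """Suggest how to pull data from one source, following the rules of thumb above."""
    # Updating once a month or less may not justify building and fixing an API connection.
    if updates_per_month <= 1:
        return "scheduled download"
    # A live connection needs both a solid API and someone who can work with it.
    if has_robust_api and has_api_expertise:
        return "api"
    return "scheduled download"


print(choose_access_method(True, True, updates_per_month=30))
print(choose_access_method(True, True, updates_per_month=1))
print(choose_access_method(True, False, updates_per_month=30))
```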
Second, figure out how you will integrate and store your data sources. Consider asking the following questions, now that you’ve completed your discovery process.
- Do I need reports that combine data from 2 or more sources?
- What type of repositories house that data currently?
- How frequently are my data sources updated (more or less than 1x/month)?
- How frequently does the report need to be updated?
- How much data am I managing (if you downloaded it, could it fit in a single spreadsheet - or will it break Excel)?
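For the last question, current versions of Excel cap a worksheet at 1,048,576 rows and 16,384 columns. A quick check like the sketch below can tell you whether an export would even fit. The function name `fits_in_excel` and the assumption of a single header row are ours.

```python
# Worksheet limits for the .xlsx format in current versions of Excel.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLS = 16_384


def fits_in_excel(row_count: int, column_count: int, header_rows: int = 1) -> bool:
    """Return True if the data, plus its header rows, fits in one worksheet."""
    return (row_count + header_rows <= EXCEL_MAX_ROWS
            and column_count <= EXCEL_MAX_COLS)


print(fits_in_excel(500_000, 40))
print(fits_in_excel(1_048_576, 10))
```

Fitting under the limit does not mean the spreadsheet will be pleasant to use. Files well below the cap can still be slow to open, filter, and merge.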
Think about your data extract, transform and load (ETL) processes. What has to happen to the data to prepare it for analysis? You might be doing these ETL steps manually, using spreadsheets (pulling multiple spreadsheets into one master spreadsheet) or automating them using something like Python. You can consider loading the data into a SQL database or a data warehouse. Not sure what a data warehouse is? Check out this blog post. You might even decide to use an Integration Platform as a Service (iPaaS).
- This decision will be influenced by what expertise you have. Organizations who don’t have data engineers or experienced analysts on staff will often resort to spreadsheets. For small data, this can be the best choice.
- Using a data warehouse requires more expertise and you need to be comfortable housing your data off-site. It is likely far more secure than spreadsheets or local databases, and should be just as secure as whatever vendor solution you are using.
Finally, decide how you will analyze and visualize your data, and how you will connect your data to the reporting tool you use. Some tools, like Tableau and Microsoft Power BI, will allow you to upload spreadsheets and cache them for visualization. If you’ve decided to go with a database or data warehouse, think about how to create a live connection to the tool of your choice.
- Reporting tool decisions usually depend on what your organization is already used to using, whether and how you need to share visualizations within and outside of the organization, and what the licensing costs are for your use case. There are many finer points of choosing the right tool, and we are happy to talk more with you about this.
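To show what automating the extract, transform and load steps with Python might look like, here is a minimal sketch using only the standard library. Everything in it is hypothetical: the two CSV extracts, the column names, and the table layout. An in-memory SQLite database stands in for whatever SQL database or data warehouse you choose.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two systems; real ones would be files or API responses.
CRM_CSV = "donor_id,name\nD001,Ann\nD002,Bo\n"
GIFTS_CSV = "Donor ID,amount\n d001 ,50\nD001,25\nD002,100\n"


def extract(text):
    """Extract: read a CSV export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_id(value):
    """Transform: store the identifier the same way in every source so it can be joined."""
    return value.strip().upper()


def load(conn, donors, gifts):
    """Load: write the cleaned rows into two tables."""
    conn.execute("CREATE TABLE donors (donor_id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE gifts (donor_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO donors VALUES (?, ?)",
        [(clean_id(d["donor_id"]), d["name"]) for d in donors],
    )
    conn.executemany(
        "INSERT INTO gifts VALUES (?, ?)",
        [(clean_id(g["Donor ID"]), float(g["amount"])) for g in gifts],
    )


conn = sqlite3.connect(":memory:")
load(conn, extract(CRM_CSV), extract(GIFTS_CSV))

# A report that combines both sources: total giving per donor.
totals = conn.execute(
    "SELECT d.name, SUM(g.amount) FROM donors d "
    "JOIN gifts g ON g.donor_id = d.donor_id "
    "GROUP BY d.donor_id ORDER BY d.donor_id"
).fetchall()
print(totals)
```

Once the integrated data sits in a database like this, a reporting tool can connect to it directly instead of relying on a hand-merged master spreadsheet.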
INVEST IN THE SUCCESS OF YOUR DATA PROJECT
A data project that goes south is always costly. Your entire budget can be blown on something that doesn’t meet your needs at all, to say nothing of the cost of your staff time. Even a moderately successful project often involves unexpected costs for additional features, the inclusion of more data, or reporting features you didn’t think you needed. We know of one nonprofit that spent $3 million on a data system only to retire it two years later, because they didn’t go through this process or one like it.
According to a recent study published by TechImpact, fewer than 50% of respondents completed a technology project in three months. Even though doing an Impact Blueprint takes time, it will save you time in the long run, and will drastically reduce your risk of cost and time overruns. Fortunately, we can do a full Impact Blueprint for you in only eight weeks.
YOU could go through these steps yourself, or you could hire Inciter to do it, but either way, this detailed assessment and planning process is vital, and will save you time and money.
Don’t wade into another risky data project without going through the Impact Blueprint process first. Get it right the first time and be certain of what your results will be.
Schedule a free consultation with us today, and let us tackle your data. We will make it work for you, for your donors, and your mission.