With most data siloed and disconnected, automated processes often need to combine data in order to produce valuable information and insights. This includes mapping, combining and reconciling data.
In this post, we want to discuss the art and science of mapping and reconciling data, and provide insight into how you can optimise and improve your data mapping processes using SolveXia.
The first important reason for matching is to map or join data together. This is often done to enrich data so that it is more useful and informative. For example, if a data set contains only a code to identify a division or cost centre, we can use that code to identify the name of the division:
By doing this, we improve the quality of our reporting, making it more meaningful for stakeholders. This type of processing is very common, particularly due to the disparate nature of systems and data. Common use cases include:
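To make the idea concrete, here is a minimal sketch in plain Python (outside SolveXia) of this kind of enrichment. The division codes and names are made up for illustration:

```python
# Hypothetical example: enriching transaction records with division names.
# A lookup table maps the raw code to a human-readable name.
division_names = {
    "D01": "Retail Banking",
    "D02": "Wealth Management",
}

transactions = [
    {"id": 1, "division_code": "D01", "amount": 120.0},
    {"id": 2, "division_code": "D02", "amount": 75.5},
]

# Map each code to a readable name, flagging unknown codes for follow-up
# rather than silently dropping them.
for row in transactions:
    row["division_name"] = division_names.get(row["division_code"], "UNMAPPED")
```

The "UNMAPPED" fallback matters in practice: codes that fail to map are exactly the exceptions a mapping process needs to surface.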
The other key reason for matching data is to perform a reconciliation. This can sometimes be a catch-all term, describing an entire process, such as a bank reconciliation. However, reconciliations can also be incorporated as steps within a larger process, for example to validate and check that data is correct before publishing a report to management.
Reconciliations are part of the fabric of many accounting departments. Amongst others, they include account reconciliations, inter-system reconciliations and balance sheet reconciliations. The core component of a reconciliation is the idea of comparing two or more sets of data using a common identifier (keys) to extract differences. This includes keys that appear in one data set but DO NOT appear in the other. It also includes keys that DO appear in both data sets but DO NOT have matching values (e.g. a variance in a balance for matching account numbers).
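The core comparison described above can be sketched in a few lines of Python. This is a generic illustration with invented account numbers, not a SolveXia instruction:

```python
# Two data sets keyed by account number, with balances to compare.
ledger = {"ACC-100": 500.0, "ACC-200": 250.0, "ACC-300": 90.0}
statement = {"ACC-100": 500.0, "ACC-200": 240.0, "ACC-400": 10.0}

# Keys that appear in one data set but not the other.
only_in_ledger = set(ledger) - set(statement)
only_in_statement = set(statement) - set(ledger)

# Keys that appear in both but whose values don't match,
# with a small tolerance to absorb rounding noise.
variances = {
    key: ledger[key] - statement[key]
    for key in set(ledger) & set(statement)
    if abs(ledger[key] - statement[key]) > 0.005
}
```

The two set differences and the variance dictionary together are the raw output of a reconciliation: everything else (ageing, adjustments, sign-off) is built on top of them.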
When matching data, you need identifiers or keys that have a consistent format or structure. For example, this can include an account number, a department code or a transaction ID. Due to the plethora of systems and ETL processes in modern companies, the format of keys can sometimes change or be inconsistent across systems and data sources. This can happen in a few ways; first, the key in one system might have a slightly different format to another, for example with leading or trailing whitespace or special characters (e.g. #) being added to the key. Another way this issue might occur is that the contents of a key for one data source might be split across several columns.
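Both problems are usually solved by normalising keys before matching. A minimal Python sketch (the formats shown are hypothetical):

```python
import re

def normalise_key(raw: str) -> str:
    """Strip whitespace and special characters so keys match across systems."""
    return re.sub(r"[^A-Za-z0-9]", "", raw).upper()

def combine_key(*parts: str) -> str:
    """Recombine a key that one source has split across several columns."""
    return normalise_key("".join(parts))

normalise_key("  #acc-1001 ")  # -> "ACC1001"
combine_key("ACC", "1001")     # -> "ACC1001"
```

Applying the same normalisation to every source before matching means formatting quirks in any one system can no longer break the join.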
In some cases, identifiers may not exist at all, requiring a key to be synthesised using whatever data is available. A common example of this is in financial transactions where the date and value of the transaction might be used (in the absence of a key or ID).
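A synthetic key of this kind might look as follows. This is a simplified sketch; real processes often add more fields (e.g. a counterparty) to reduce collisions between transactions that share a date and amount:

```python
def synthetic_key(date: str, amount: float) -> str:
    """Build a matching key from transaction date and value.

    The amount is formatted to two decimal places so that minor
    representation differences (199.9 vs 199.90) don't break the match.
    """
    return f"{date}|{amount:.2f}"

synthetic_key("2024-03-01", 199.9)  # -> "2024-03-01|199.90"
```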
Another challenge associated with matching data is the sometimes varying levels of granularity between data sets. For example, one data set might be at the account level, while another is presented at the sub-account or transaction level. This can cause issues when mapping data, such as creating duplicate records. It can also result in false positives/negatives when reconciling data.
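The standard fix is to roll the finer-grained data set up to the coarser level before matching. A small Python sketch with invented figures:

```python
from collections import defaultdict

# Transaction-level data, to be rolled up to account level
# before comparing against an account-level data set.
transactions = [
    ("ACC-100", 300.0),
    ("ACC-100", 200.0),
    ("ACC-200", 250.0),
]

account_totals = defaultdict(float)
for account, amount in transactions:
    account_totals[account] += amount
# account_totals["ACC-100"] -> 500.0
```

Once both sides sit at the same level of granularity, a simple key-based comparison works again and duplicate-record joins are avoided.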
An issue more relevant to reconciling data is timing differences. This is common for tasks such as bank reconciliations, where transactions are often delayed before appearing in the bank statement. This adds complexity to the matching, as unmatched items must be carried forward over time until a match becomes possible.
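One common pattern is to carry yesterday's unmatched items into today's run and match the new statement lines against both. A minimal sketch with hypothetical transaction IDs:

```python
# Unmatched items carried forward from the previous run.
open_items = {"TXN-1": 100.0}

# Today's new ledger entries and today's bank statement lines.
todays_ledger = {"TXN-2": 50.0}
todays_bank = {"TXN-1": 100.0, "TXN-2": 50.0}

# Match today's bank lines against both carried-forward and new items.
candidates = {**open_items, **todays_ledger}
matched = {
    key for key in todays_bank
    if key in candidates and abs(todays_bank[key] - candidates[key]) < 0.005
}

# Whatever remains unmatched is carried forward again tomorrow.
still_open = {k: v for k, v in candidates.items() if k not in matched}
```

TXN-1, delayed yesterday, clears today; the `still_open` set is what an ageing report is built from.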
Ideally, when you match data, there will be a one-to-one relationship in records. That is, the matching will be driven by a unique identifier that appears only once in each data set (such as an account number). However, due to differences in granularity or a lack of a clear identifier, matching can become more complex, requiring one-to-many or many-to-many relationships. This often requires the person doing the matching to loop through different combinations of records and compare aggregated balances (for example multiple bank transactions being compared to a single transaction in the GL).
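A simple form of one-to-many matching is to group the "many" side by a shared reference and compare aggregated totals against the "one" side. A Python sketch with invented invoice references:

```python
from collections import defaultdict

# Several bank transactions settle a single GL entry; group the bank
# side by a shared reference and compare the aggregate.
bank_lines = [
    ("INV-7", 40.0),
    ("INV-7", 60.0),
    ("INV-8", 25.0),
]
gl_entries = {"INV-7": 100.0, "INV-8": 30.0}

bank_totals = defaultdict(float)
for ref, amount in bank_lines:
    bank_totals[ref] += amount

variances = {
    ref: gl_entries[ref] - bank_totals.get(ref, 0.0)
    for ref in gl_entries
    if abs(gl_entries[ref] - bank_totals.get(ref, 0.0)) > 0.005
}
# variances -> {"INV-8": 5.0}
```

This only works when a grouping reference exists; without one, the matcher has to search combinations of records whose amounts sum to the target, which is considerably harder.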
Once a matching process is completed, exceptions will often be produced. These include cases where an identifier cannot be found in one of the data sets, or where there are variances in values for matched records (e.g. a variance in the balance for two accounts). Each exception can spawn a workflow where staff need to assess and resolve it. This may mean updating a mapping table or data set and repeating the process. It can also mean making an adjustment (e.g. to a balance) in an auditable way and then continuing with the task. The permutations for error handling can become very complicated and difficult to manage, particularly in an environment where the work is being performed manually.
While matching data can be a challenging task, modern automation tools like SolveXia are equipped to handle even the most complex matching, mapping and reconciliation work. This makes SolveXia a natural choice for any data-matching challenges in your company's processes.
SolveXia’s existing library of 100+ data manipulation instructions is ideal for cleansing and aggregating data. Some good examples of instructions for this purpose include:
SolveXia also has purpose-built instructions that are made for matching data. These include:
In addition to pre-processing data to deal with issues in identifiers or granularity, the instruction set can also be used to perform complex one-to-many or many-to-many matching. If you’d like to learn how to perform complex matching in SolveXia, I recommend reaching out to our support team via firstname.lastname@example.org who can provide you with a tailored one-on-one session to get you up and running.
You can create Managed Tables (SQL) in SolveXia to store and persist data. This is incredibly useful to deal with timing differences for reconciling data. For example, a daily bank reconciliation might produce exceptions which can be copied to a managed table. This data can then be extracted the following day and used in the reconciliation. This process can continue, day-by-day. The data can also be extracted to reports, such as a transaction ageing report.
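To illustrate the underlying pattern (independently of SolveXia's Managed Tables feature), here is a sketch using Python's built-in sqlite3 module. Table and column names are invented for the example:

```python
import sqlite3

# Persist today's reconciliation exceptions so tomorrow's run can retry them.
# An in-memory database is used here; a real process would use durable storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS open_exceptions "
    "(txn_id TEXT, amount REAL, first_seen TEXT)"
)
conn.execute("INSERT INTO open_exceptions VALUES ('TXN-1', 100.0, '2024-03-01')")
conn.commit()

# The next day's reconciliation pulls the carried-forward items back in.
carried_forward = conn.execute(
    "SELECT txn_id, amount FROM open_exceptions"
).fetchall()
```

The `first_seen` column is what makes a transaction ageing report possible: each open item records when it first appeared as an exception.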
The other benefit of using the managed tables is increased performance (i.e. faster processing).
To handle and have better visibility of errors in your data matching, you can utilise SolveXia’s dependencies feature. Dependencies allow you to branch a process based on the outcome of a particular step - for example, a step that checks for mapping issues in your data. In the screenshot below, steps 2.5 and 2.6 run only if step 2.4 (Map Data) fails:
Step 2.4 uses the Map Values by Key instruction (mentioned above) and has a failure condition configured as follows:
When errors do occur, you can extract the records having issues and use the Email Confirmation step to send them to users in the process. This step-type can be used to capture feedback and approval from users (e.g. certification that the user has reviewed and accepted the errors). You can also leverage the Request Files Via Email step to allow users to update and reload data (e.g. with corrections for mapping errors).