If you run a process where the input files are sufficiently large and the quality of the underlying data is relatively poor, you may receive the following message:
"Your data set had a large number of potential partial matches and unmatched records. Duco has found the partial matches by automatic grouping to cut down comparisons."
This means that Duco has not yet tried to calculate the optimal partial matches for your Process.
A combination of factors, including partially configured rules, large input data, poor-quality underlying data, and a low partial match threshold, could otherwise make the Process run very slowly. Instead, Duco returns roughly matched preliminary results that you can analyse and use to improve your matching rules.
Duco runs better when the majority of records match and only a small proportion are partial matches or unmatched. Duco therefore stops the process and gives you the opportunity to refine your configuration. There are a few things you can do:
1) Partition or split your data in some way, for example by a certain group of accounts or by product type. You can do this with filter rules, so you don't have to prepare new input files. Alternatively, you could split the source data manually outside of Duco.
2) If you have additional fields that you could readily match on but have chosen not to, adding them as match fields might help. This helps optimise Duco and allows it to run against records with more matches. Alternatively, if you don't have extra fields to match on, you can try creating a dummy match field.
3) Duco usually starts to struggle with this at 20k+. You could run the reconciliation with a smaller amount of data, view the results, and refine the configuration. Then upload the full data set.
4) Drop the partial match score from 50% to 10% during set-up. This will reduce the number of unmatched items and will likely improve the match rate. You can then work with these results to identify rules and refinements that can be made to the process. Once the process is production-ready, it's best practice to return to a higher partial match score.
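If you choose to split the source data manually outside of Duco (option 1) or to prepare a smaller trial file (option 3), the following Python sketch shows one possible way to do it for CSV input. It is not part of Duco and makes several assumptions: the input is a CSV file with a header row, the partition column (for example a hypothetical "product" column) has a modest number of distinct values, and those values are safe to use in file names. The function names split_by_column and take_sample are illustrative only.

```python
import csv
from contextlib import ExitStack
from itertools import islice
from pathlib import Path


def split_by_column(src, column, out_dir):
    """Write one CSV per distinct value of `column`; return {value: path}.

    Assumes a header row and that the column values are valid in file names.
    Rows are streamed, so the whole input is never held in memory, but one
    output file stays open per distinct value.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    writers = {}
    paths = {}
    with ExitStack() as stack, open(src, newline="") as f:
        reader = csv.DictReader(f)
        for row in reader:
            value = row[column]
            if value not in writers:
                # First time this value is seen: open its file, write header.
                path = out_dir / f"{Path(src).stem}_{value}.csv"
                out = stack.enter_context(open(path, "w", newline=""))
                writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                writer.writeheader()
                writers[value] = writer
                paths[value] = path
            writers[value].writerow(row)
    return paths


def take_sample(src, dst, max_rows):
    """Copy the header plus the first `max_rows` data rows; return rows copied."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to copy
        writer.writerow(header)
        count = 0
        for row in islice(reader, max_rows):
            writer.writerow(row)
            count += 1
        return count
```

For example, split_by_column("trades.csv", "product", "parts") would write one file per product type into a "parts" folder, and take_sample("trades.csv", "trades_sample.csv", 5000) would produce a smaller file for a trial run. The file names and the row count here are placeholders. Note that take_sample keeps the first rows rather than a random selection, so the sample is only representative if the file is not sorted in a way that skews it.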