How can data files that monitor the same test call but come from disconnected sources be directly compared with each other? In the Post Mortem Analysis of T.30 and T.38 Traffic posting, I pointed out that QualityLogic’s DataProbe T30-T38 Analyzer has the ability to import fax call data from .wav sound files and .pcap WireShark data files. Several engineers have asked about the process of matching up the content of these files for analytical comparisons. This is a tricky proposition when there could be many calls (and other IP activity) in the WireShark file that must be parsed to find the call captured in the .wav file.
Assigning Heuristics to ID Numbers
To identify different records of the same call, QualityLogic employs a range of heuristics to assign an arbitrary ID number to each detected instance of that call. This is mainly a matter of working through a checklist of call attributes to see whether they match, and rejecting candidate matches that have absolute misses from the list. Any number of analog PSTN and/or Fax over IP (FOIP) call results that are parsed as different captures of the same single fax call can be given the same ID number. An example of this display is shown below:
To start, a cache holds the parsed characteristics of all call results that have been added to the results database since the start of the current DataProbe execution. This includes all the calls found in imported data files. Each parsed result is compared to the cache to find the highest score of a set of call feature metrics. When these metrics indicate sufficient similarity, the appropriate ID number is assigned to the call result.
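The cache lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not DataProbe's actual code: the names CallResult, CallCache, and overlap_score, the feature sets, and the threshold value are all hypothetical, and the scoring function is a simple stand-in for the real call feature metrics.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, FrozenSet, List, Optional


@dataclass
class CallResult:
    """Hypothetical parsed call result; 'features' stands in for real metrics."""
    name: str
    features: FrozenSet[str]
    call_id: Optional[int] = None


def overlap_score(a: CallResult, b: CallResult) -> int:
    """Stand-in metric: shared features count for, differing ones against."""
    return len(a.features & b.features) - len(a.features ^ b.features)


class CallCache:
    """Holds every result seen in this run and hands out call ID numbers."""

    def __init__(self, score_fn: Callable[[CallResult, CallResult], int],
                 min_score: int) -> None:
        self._results: List[CallResult] = []
        self._score_fn = score_fn
        self._min_score = min_score      # assumed threshold, not from the source
        self._next_id = count(1)

    def assign_id(self, result: CallResult) -> int:
        # Find the cached result with the highest similarity score.
        best_score: Optional[int] = None
        best_match: Optional[CallResult] = None
        for cached in self._results:
            score = self._score_fn(result, cached)
            if best_score is None or score > best_score:
                best_score, best_match = score, cached

        if best_match is not None and best_score >= self._min_score:
            result.call_id = best_match.call_id   # same call, reuse its ID
        else:
            result.call_id = next(self._next_id)  # no match: new unique ID
        self._results.append(result)
        return result.call_id


cache = CallCache(overlap_score, min_score=2)
live = CallResult("live capture", frozenset({"V.17", "MH", "fine"}))
pcap = CallResult("imported pcap", frozenset({"V.17", "MH", "fine"}))
other = CallResult("unrelated call", frozenset({"V.34", "JBIG"}))
for r in (live, pcap, other):
    print(r.name, "->", cache.assign_id(r))
```

In this sketch the first two results share an ID because their features agree, while the third falls below the threshold and receives a new ID.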
The process initially assumes that the two call records being compared are a match. Their call metrics are then tested against the following criteria to see whether any of them disproves this assumption. First, incompatible telephone call start times will prevent a match. For live results captured directly by DataProbe, this is verified by checking that the start times match within 30 seconds; the answerer delay from ring to answer is subtracted for this comparison. For imported results, the start times are tested modulo 3600 seconds so that results captured in other time zones, or under DST, can still be matched. Since imported .wav files carry no start time, they skip this test.
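A minimal sketch of the start time test might look like the following. The function name and parameters are hypothetical. Two things are assumptions rather than facts from DataProbe: that the same 30 second tolerance applies after the modulo step for imported results, and that the start times passed in have already had the answerer delay subtracted.

```python
from typing import Optional

LIVE_TOLERANCE_S = 30   # from the text: live start times must match within 30 s
HOUR_S = 3600           # from the text: imported results are compared modulo 3600 s


def start_times_compatible(start_a: Optional[float],
                           start_b: Optional[float],
                           imported: bool,
                           tolerance_s: float = LIVE_TOLERANCE_S) -> bool:
    """Return False only when the start times rule out a match.

    Times are seconds since some common epoch, assumed to be already
    adjusted for the answerer delay. None means the record has no start
    time (for example an imported .wav file), so the test is skipped.
    """
    if start_a is None or start_b is None:
        return True  # nothing to compare, so this test cannot veto the match

    diff = abs(start_a - start_b)
    if imported:
        # Ignore whole-hour offsets caused by time zones or DST.
        diff %= HOUR_S
        # A difference of 3595 s is really 5 s on the other side of the hour.
        diff = min(diff, HOUR_S - diff)
    return diff <= tolerance_s


print(start_times_compatible(1000.0, 1020.0, imported=False))       # True
print(start_times_compatible(1000.0, 1040.0, imported=False))       # False
print(start_times_compatible(1000.0, 1000.0 + 7210, imported=True)) # True
print(start_times_compatible(None, 1000.0, imported=True))          # True
```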
Inspecting the Content of the Fax Call
Don’t go away, we’re not finished yet. DataProbe then inspects the call’s content. CSI/TSI/CIG field contents that are present but incompatible will prevent a call match. To avoid confusion, non-ASCII contents parsed from bad FCS frames cause the CSI/TSI/CIG to display a ‘?’ character, and DataProbe skips this test. Additional metrics are then weighted and summed to find a similar call. This is done using coded bit fields, created by parsing a set of call characteristics, that are AND’ed and also XOR’ed between the two call results. These bits represent:
- Modulation of the first page of the call
- Modulations used by all pages of the call
- Resolutions used by all pages of the call
- Encodings used by all pages of the call
The resulting bits are counted up to produce a (+) or (-) effect on the call matching score. A call must achieve a minimum score to be indicated as a match. The result is that an imported call record will only be matched to another recording of that same call; if no match is found, it is labeled with a unique ID number.
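The bit field scoring can be illustrated with a short sketch. The bit assignments, the CallBits structure, the equal weighting, and the minimum score below are all hypothetical. The sketch also assumes that bits set in both records (the AND) count in favor of a match and bits set in only one record (the XOR) count against it, which is one plausible reading of the description above.

```python
from dataclasses import dataclass

# Hypothetical bit assignments for each coded field.
MOD_V27, MOD_V29, MOD_V17, MOD_V34 = 0x1, 0x2, 0x4, 0x8
RES_STANDARD, RES_FINE, RES_SUPERFINE = 0x1, 0x2, 0x4
ENC_MH, ENC_MR, ENC_MMR, ENC_JBIG = 0x1, 0x2, 0x4, 0x8

MIN_SCORE = 3  # assumed threshold; the real value is not given in the text


@dataclass(frozen=True)
class CallBits:
    """The four coded bit fields listed above, one integer per field."""
    first_page_modulation: int
    modulations: int
    resolutions: int
    encodings: int


def popcount(value: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def field_score(a: int, b: int) -> int:
    """Shared bits (AND) add to the score, differing bits (XOR) subtract."""
    return popcount(a & b) - popcount(a ^ b)


def match_score(a: CallBits, b: CallBits) -> int:
    """Sum the per-field scores; every field is weighted equally here."""
    return (field_score(a.first_page_modulation, b.first_page_modulation)
            + field_score(a.modulations, b.modulations)
            + field_score(a.resolutions, b.resolutions)
            + field_score(a.encodings, b.encodings))


def is_match(a: CallBits, b: CallBits, min_score: int = MIN_SCORE) -> bool:
    return match_score(a, b) >= min_score


live = CallBits(MOD_V17, MOD_V17, RES_FINE, ENC_MH)
imported = CallBits(MOD_V17, MOD_V17, RES_FINE, ENC_MH)
unrelated = CallBits(MOD_V34, MOD_V34, RES_FINE, ENC_JBIG)

print(match_score(live, imported), is_match(live, imported))    # 4 True
print(match_score(live, unrelated), is_match(live, unrelated))  # -5 False
```

Packing each characteristic into a bit field keeps the comparison cheap: one AND, one XOR, and two bit counts per field, regardless of how many pages the call contained.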