Zimmerman Prosecution’s Voice Expert admits: “This is not really good evidence”

ANDREW BRANCA

JUN 8, 2013 8:00 AM

Two big developments took place yesterday:

1. An Exemplar of Trayvon Martin Speaking Was Obtained by ABC News

2. At the Frye Hearing the voice experts for the prosecution were examined, and they didn’t do well.

[Image: Trayvon Martin’s cell phone.]

1. Exemplar of Martin Speech Obtained — ABC Voice Expert Suggests Zimmerman the Screamer “by significant margin”

A major missing piece of the audio forensics evidence in the Zimmerman case has long been an exemplar, or example, of Trayvon Martin’s voice. Its absence has greatly complicated the task of identifying whether it was Martin, Zimmerman, or some combination of the two who was producing the screams in the background of Witness #11’s 911 call. Yesterday ABC News reported that it has exclusively obtained a sample of Martin’s voice (although it also states that both the prosecution and defense possess the audio, there is no indication of when they might have acquired it).

Only brief snippets of the audio recording have been played on ABC News. I have no idea if these snippets are the entirety of ABC’s find or just a small portion. In any case, I have captured these snippets from news broadcasts, combined them into one audio file, and then applied the “Owens methodology” to repeat the combined file three times in order to achieve 16 seconds of total audio. (For an explanation of the “Owens methodology”, see below. FYI, I do not anticipate using my altered version of this audio for forensic purposes). You will recall that the speech experts were attempting to determine the identity of the person screaming in the background of the Witness #11 911 call.

For comparison purposes, the Witness #11 911 call:

In yesterday’s testimony, Dr. Reich refers to a time when he didn’t yet have a Martin exemplar, suggesting he may have acquired one since. ABC News sent the Martin audio sample to forensic analyst Kent Gibson of Forensic Audio. He tells ABC that “a comparison with Martin’s voice, Zimmerman’s voice and the screams on the 911 tape indicate that the voice is more likely to be Zimmerman than Martin by a significant margin.” He noted, as did Dr. Hirotaka Nakasone in yesterday’s Frye hearing, that only two seconds of the audio screaming is sufficiently unmuffled to be usable, and that any positive identification of the screamer is therefore unlikely. Both Tom Owens and Dr. Alan R. Reich, expert witnesses for the state on the analysis of the Zimmerman case audiotapes, testified in yesterday’s Frye hearing, and the bloody details are below. The Frye hearing will continue this morning (yes, Saturday) at 9AM.

NOTE: Read to the bottom of this post to access the bonus “Crazy Thought of the Day about the Zimmerman Case”.

2. Prosecution’s Voice Expert: “This is not really good evidence.”

Yesterday the state attorney presented two of its expert witnesses on speech recognition and speaker identification to the court as part of the ongoing Frye hearing to determine whether their testimony should be admissible at trial. Having listened to both appearances live (-stream), my overwhelming perception is that it was a disastrous day for the prosecution on this issue, and the outcome will only be worse if either of these “experts” is put in front of a jury.

Tom Owens, Owens Forensic Services

The first expert witness was Tom Owens, owner of Owens Forensic Services. Mr. Owens conducts voice recognition analysis using EasyVoice Biometric software, a commercial product he also offers for sale. Mr. Owens’ academic background consists of an undergraduate degree in history, with no formal training in mathematics. As a teenager and young man he held various positions in the sound industry, typically involving theater or music. In 1981 he started his audio and visual forensics business. Because Owens was a prosecution witness, direct examination was undertaken by the state attorney. Mr. Owens came across as a laid-back and gregarious fellow. Unfortunately, things began to unravel rapidly once cross-examination began. Among the key revelations:

Owens Exaggerates His Relevant Experience As an Expert Witness for Zimmerman Case

When asked by the prosecution about his experience, Owens stated that he had worked in the field for 50 years and that he had served as an expert witness in court more than 300 times, including in Florida at both the state and federal level. In fact, his relevant experience for this trial, meaning his experience in using the EasyVoice software for speech recognition and speaker identification and his appearance as an expert witness regarding that use, was far more limited:

West: “How many times have you testified in court specifically about the EasyVoice software that you used in your analysis for this case?”

Owens: “I only testified once or twice on the EasyVoice software.”

West: “You’ve only had the software for a couple of years, and you’ve only testified in court about it once, is that correct?”

Owens: “Correct.”

Obfuscation on Whether EasyVoice Software Meets NIST Standards

Mr. Owens also proved prone to making exaggerated claims for the EasyVoice software and his own application of it, only to have to rapidly backtrack when pressed with facts by the defense.

For example, when asked if the EasyVoice software had been submitted to NIST, the Federal agency that establishes standards for such matters, he answered in the affirmative.

Owens: “The company has.”

West: “Is it your testimony that this software you used for this analysis has been submitted to NIST?”

Owens: “I don’t know if my version of the software was submitted, but they submitted the product.”

West: “Do you know how it did?”

Owens: “Fairly well.”

West: “What is ‘fairly well’?”

At that point Owens tried to read from what he called “the NIST letter”, but which turned out to be a letter sent from the manufacturer of the EasyVoice software to NIST. In any case, he was not permitted to read from it as it was not in evidence. He promised to email it to the court.

Further, when asked whether the NIST evaluation of whatever version of the EasyVoice software was submitted had been based on screams, as his analysis had been, Owens replied that he didn’t know.

Turns Out Owens Has Been Using Current EasyVoice Only A Year or Two

Similarly, Owens was asked if he was using the most current version of the software. He said he was. He was then asked how long he had been using that software, and he said two to three years. When West asked if it was correct, then, that the EasyVoice company had not updated its software for two to three years, Owens quickly backtracked and said he had misspoken and that he had actually obtained his current version of the software “maybe a year or so ago, I can’t really recall.”

Surprise! Owens Has a Financial Stake in Sales of EasyVoice Software

Although not remarked upon in Owens’ introduction to the court by the state attorney, defense counsel revealed in cross-examination that Owens has a financial stake in sales of the EasyVoice software. This obviously suggests the possibility that he might be biased towards findings that meet a client’s “expectations”. Worse, even once the financial interest was disclosed, Owens continued to obfuscate about the extent of his interest.

West: “What is your financial relationship with the distributor of this EasyVoice biometric software?”

Owens: “That depends on the question.”

West: “No, tell me exactly how much money you make when one of these are sold.”

Owens: “I make a small percentage of what they make after expenses.”

West: “Please explain how that works, how much it costs, who gets what, including you.”

Owens: “How is that relevant?”

[Pause.]

West: “Will the court direct the witness to answer the question?”

[Court does so.]

In fact, Owens would soon be obliged by the defense to reveal that rather than earning “a small percentage of what they [the distributor] make after expenses,” his commission is 50% of the distributor’s profit, as much as $1,250 each time a license for the software is sold.

West: “So, you make money every time one of these programs is sold?”

Owens: “Yes.”

Methodology? Sure, got some right here.

Similar recalcitrance was encountered by the defense when they asked about the methodology Owens applied in the Zimmerman case.

West: “Can you cite any research or studies that justify that method of speaker identification?”

Owens: “I told you, I did a study in 1985 and did a video tape and presented it at a conference.”

West: “I’m talking about research or studies by anybody but yourself.”

Owens: “I’ve not been able to find anyone who has done the same approach.”

Owens Admits Audio Recording Too Short for Analysis—So He Loops It

One of the most shocking moments of the testimony, however, involved Owens’ odd methodological choices, in particular his response when his EasyVoice biosoftware informed him that the screams were of too short a duration to enable analysis.

West: “What was the total duration of those 10 screams?”

Owens: “About 7 seconds.”

West: “I believe you earlier testified that you want 16 seconds of speech for the software to work reliably.”

Owens: “The software would like to see 16 seconds. If you put in 7 seconds the software says it’s not long enough. So I doubled it up. I repeated the same audio twice, back to back, and then the program would run the analysis.”

West: “Are you aware of studies or research of this software using that method you describe where you enter less than enough speech by doubling up?”

Owens: “You don’t enter less speech by doubling up, you increase the amount of speech.”

West: “So, if I say ‘Hello’ and repeat that 16 times, is that 16 seconds of speech that’s appropriate for your software?”

Owens: “No, [stuttering] as far as doubling up it just provides enough to make a decision.”

West: “So in other words unless you had doubled it up and looped them the software would have rejected the sample.”

Owens: “A screen comes up and tells you that it won’t run. We didn’t have any more words to give the machine, so I doubled it up, because that’s all we had.”

West: “Because it then thought there was more speech than it previously had.”

Owens: “It knew there was more speech.”
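To make the point of that exchange concrete, here is a rough sketch of what looping does to a sample. It is purely illustrative: the word list, the 7-second duration, and the `loop_sample` helper are my own stand-ins, not the actual 911 audio and not anything EasyVoice does internally. The helper repeats the sample until it clears the 16-second threshold, which is my assumption about how such looping would work in general.

```python
def loop_sample(words, seconds, minimum_seconds):
    """Repeat a sample back to back until it meets a minimum duration.

    Hypothetical helper for illustration only; not EasyVoice's actual logic.
    """
    copies = 1
    while copies * seconds < minimum_seconds:
        copies += 1
    return words * copies, copies * seconds


# Made-up sample: these words and the 7-second duration are illustrative only.
sample_words = ["I", "need", "help", "help", "help", "help"]
looped_words, looped_seconds = loop_sample(sample_words, 7.0, 16.0)

print("duration:", 7.0, "->", looped_seconds)
print("word count:", len(sample_words), "->", len(looped_words))
# The set of distinct words is the same before and after looping.
print("distinct words unchanged:", set(sample_words) == set(looped_words))
```

The duration and the raw word count go up, but the set of distinct words is identical before and after. That is the gap West was probing: a length check can be satisfied without the sample containing any new speech.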

Owens Admits Key Step in His Methodology Was Novel

As might be expected, Owens became increasingly defensive, and made corresponding missteps.

West: “When you realized that there was a problem with the speech sample, besides all the other problems, it wasn’t long enough for the machine to be able to conduct a reliable analysis, is that correct?”

Owens: “It wasn’t long enough for the machine to do its analysis.”

West: “So at that point you contacted someone from the company to ask them what to do?”

Owens: [Angry] “I’ve been using this software for 15 years.”

West: “I thought you said only 1 to 2 years?”

. . .

West: “You’ve never done this before, with this software, by looping the unknown sample until you have enough duration to put into the machine?”

Owens: “I never had a program before that put a flag on the screen that you need a larger sample. When I had that issue I went around to talk to different people and tried different things and arrived at a conclusion.”

West: “So this is brand new stuff, isn’t it?”

Owens: “Brand new about the looping.”

. . .

West: “So, you needed 16 seconds, and you found 7.”

Owens: “The software recommends 16 seconds. . . .”

West: “You knew that was half of the recommended minimum speech sample.”

Owens: “That’s the recommended, it doesn’t say you can absolutely not do it.”

West: “What is the minimum amount of speech required for a reliable analysis?”

Owens: “Twenty words is the minimum amount of speech.”

West: “So, if you count the words, you have about 14 words in the one 7 or 8 second section.”

Owens: “Yes.”

West: “And how many of those words were ‘help’?”

Owens: “We had, ‘I need help,’ ‘help,’ ‘help,’ ‘help,’ ‘I want to help,’ ‘help,’ ‘help’.”

West: “So do you think that the 20 word minimum means the same word repeated over and over, or are you looking for something phonetically balanced?”

Owens: “What’s your definition of ‘phonetically balanced’?”

“Standards” Appear to be Amorphous in Owens’ Analysis

Similar exchanges occurred when the defense asked about the standards Owens used in his analysis for the Zimmerman trial.

West: “Did you use [aural-spectrographic analysis] to reach your opinion in this case? If so, what standards did you use?”

Owens: “There’s a single page paper which a friend of mine and I published in 1996 which addressed these issues very carefully.”

West: “So what are the speech standards for aural-spectrographic methodology? How much speech is needed, does stress screaming influence things, and what process is used for analyzing recordings under aural-spectrographic analysis?”

Owens: “There are a set of standards published on my web site, you can download those.”

West: “Why don’t you tell us.”

Owens: “I’m in the process of updating those standards, they reference other things that are out of date. . . .”

West: “My question is what standards did you use in this case, what standards you tried to apply, and whether you had to deviate from those standards to do your aural-spectral analysis?”

Owens: “If you go back and read my papers and you read the standards that are published, it tells you what the standards are.”

West: “Tell me, for example, how much speech you would need to do the analysis under your standards.”

Owens: “Twenty words. It also calls for a certain amount of clarity.”

West: “And you did not have 20 words in this case.”

Owens: “Correct.”

West: “So, how were you able to do that [analysis]?”

Owens: “I can’t make a positive ID, I have to make a probable or highly probable.”

Perhaps most amazing was when Owens simply threw the audio recording evidence under the bus, stating that “this is not really good evidence”.

West: “So either you had to abandon your standard or devise a new approach for this case.”

Owens: “In many cases in these analysis you don’t have 20 words, especially for 911 calls. [The standard] is 20 words for good evidence, and this is not really good evidence. The next level down is 10 words. Can you make a positive ID with 10 words? No. I did not make a positive ID, I made a probable ID.” [emphasis added]

West: “You did not have enough speech sample including length or number of words to do your standard aural-spectral analysis.”

Owens: “Incorrect, I did not have enough to do a positive analysis. I can’t say it’s positively not Zimmerman.”

West: “You were not able to reach a conclusion.”

Owens: “No, I reached a conclusion that it’s probable, rather than positive, identification.”

West: “Do you quantify that in any way?”

Owens: “What do you mean by ‘quantify’? I don’t have enough words to do a positive identification.”

West: “So you were not able to reach a conclusion definitively.”

Owens: “I can reach a conclusion, I just can’t say positive yes or no.”

. . .

West: “Did you match up the screams of the recording with exemplars of Zimmerman and listen to them back-to-back or sound-to-sound?”

Owens: “I listened to the screams that were fairly audible, and there were several, and compared that to him saying ‘help’ on the re-enactment audio.”

West: “So the re-enactment exemplar was not, you would agree, done under the same levels of emotion as the 911 recording?”

Owens: “Correct.”

West: “It’s not a comparable exemplar.”

Owens: “Correct.”

West: “Are you aware of any published studies that support the comparison of an unknown scream to a known speaker?”

Owens: “That’s the whole purpose of raising the pitch of the exemplar to match the scream.”

West: “Are you aware of any published studies that allow you to do that and then conduct an aural comparison?”

Owens: “I’m not aware of any studies, per se.”

Standards? What Standards?

Owens’ testimony closed with some flailing around the subject of standards, or the lack thereof.

West: “On this idea of standards, are you saying they don’t mean anything, that any individual examiner can set whatever standards they want for themselves and it’s still accepted in the community?”

Owens: “That’s a ridiculous statement.”

West: “Tell me how.”

Owens: “The reason for standards is to set guidelines. I wrote those standards, I and about 10 other individuals, and in doing those we had to decide what was the best evidence (20 words) and what was usable (10 words) and what was iffy.”

West: “So they are recommendations.”

Owens: “It’s important to have standards in any science, important to have the best evidence, too, we would all like this to be the best tape. It is what it is, so we have to take what it is and try to make the best job we can to make a determination.”

West: “And in order to do that, you had to substantially deviate from the standards that you yourself drafted with many others however many years ago.”

Owens: “I didn’t substantially deviate, I just didn’t have 20 words.”

West: “You didn’t have anywhere near 20 words, did you.”

Owens: “I had 10 words.”

West: “And most of those were the same word over and over.”

Dr. Alan R. Reich, Speech Scientist

Alan R. Reich is a former professor from the University of Washington who received his PhD in “speech science.” Dr. Reich published scientific articles sporadically on various aspects of human acoustics from the mid-1970s to the early 1990s. Dr. Reich was retained by the Washington Post newspaper to analyze the recording. The Post reports that he has been involved in hundreds of criminal and civil cases for more than 35 years. I’m afraid that I can’t do the same justice to the testimony of Dr. Reich as was possible for Mr. Owens, for the simple reason that Dr. Reich’s testimony was all but impossible to understand; even the court reporter occasionally had to ask the judge to stop his testimony and have him repeat himself. Part of the difficulty was that Dr. Reich’s testimony was brought into the court via teleconference, and although the court’s equipment appeared to be state of the art, perhaps Dr. Reich’s was not. In addition, Reich’s voice was characteristic of someone who is either very elderly, has a serious speech impediment, or both. This is no reflection on the excellence, or lack thereof, of his expert testimony; I mention it only to explain the paucity of detail in this discussion of his testimony.

Reich’s Findings Still Inconclusive, With Trial to Start on Monday

Perhaps the most surprising part of Dr. Reich’s testimony, with the trial set to start on Monday, was that he described his findings as still being tentative.

West: “I’m not clear on what you said your conclusions were, other than they were tentative, are you saying you have not reached a final conclusion?”

Reich: “I’m saying the number of data points, it is not possible to.”

West: “So you have not in fact reached an opinion, you just have a tentative impression.”

Reich: “I have an opinion, but it can’t be very hard and fast because the circumstances are not the best.”

West: “So it’s an opinion, but you don’t hold it very firmly.”

Reich: “I wouldn’t say that I hold it firmly, but the circumstances don’t allow for extensive analysis.”

Standards? We Don’t Need No Stinkin’ Standards!

As with Mr. Owens, any effort to determine what standards had been applied in the analysis quickly bogged down in generalities and obfuscation.

West: “What are accepted guidelines and standards for speaker identification and analysis?”

Reich: “It depends on who is doing it. Some people say 10 words (seconds?) is the minimal number. No rules are really hard and fast.”

. . .

West: “I would like to explore the standards in your scientific community for speaker identification including duration, quality, etc., and if those standards vary I’d like to know which entities and associations use different standards. Then I would like you to tell me what your standard is, and how you applied it to this case.”

Reich: “The FBI testified that their standard is 20 different words in the same sentence context. The features that are studied are pretty much the same from one group to another. The way in which they are prepared.”

West: “So the FBI uses 20 words. Are you familiar with any other groups that establish standards?”

Reich: “Yes.”

West: “OK, and as to the number of words or length of speech what is the standard there.”

Reich: “Ten.”

West: “Ten words, or ten seconds, or either one.”

Reich: “Ten words.”

West: “Are you familiar with any standard length for the recording?”

Reich: “I’ve heard a number, I’ve never seen it printed anywhere.”

West: “What do you believe it to be?”

Reich: “Ten seconds.”

. . .

West: “For example, are there particular standards for screams?”

Reich: “This word ‘standards,’ are these factors that are taken into account in doing the work, absolutely. Are there standards that are specific for a particular kind of emotional speech? Probably not.”

West: “So there are no strict standards that you have established for yourself on either the number of words or the duration or the state of the speaker.”

Reich: “I base my work on my review of the literature over many, many years.”

. . .

West: “What are your internal standards that you apply?”

Reich: “That there is a reasonable corpus of speech, it’s not something that you can put a number on.”

West: “So you do a subjective analysis, then, on if it’s suitable, there’s no hard and fast rule.”

Reich: “Why don’t you tell me what a hard and fast rule is, and I’ll tell you if I want one.”

West: “For example, that no speaker identification should be attempted, because it is not reliable, if the speech is less than 14 seconds long.”

Reich: “That would be overly conservative.”

West: “So, you would disagree with that standard and you would not apply it in your own work.”

Reich: “Apply as a hard and fast standard on 14 second speech? It’s not based on anything other than a committee decision.”

Reich: The Research I Follow May Not Be Appropriate for Forensic Purposes

West: “Tell me of the research you’re talking about when you talk about the standard that should be applied.”

Reich: “The research tells us what kind of minimal signal quality we need. It may not be appropriate for forensic evaluation, but nonetheless they tell us.”

CRAZY THOUGHT OF THE DAY: MIGHT SPEECH TESTIMONY OPEN DOOR TO DRUG EVIDENCE?

[Image: Trayvon Martin apparently smoking.]

Much has been made of the court’s decision to prevent the defense from introducing evidence of Martin’s drug use in their opening remarks. The judge has, however, reserved the option to allow testimony on his drug use if the circumstances warrant it. Hearing Dr. Nakasone’s testimony yesterday sparked a crazy thought, particularly when he was discussing how changes in a person’s affect can distort that person’s voice. One of the many factors he listed was intoxication. If the prosecution insists on introducing speech recognition or speaker identification evidence at trial, might the fact that intoxication could have altered Martin’s voice from the exemplar, and therefore tainted the speech analyst’s findings, open the door for the defense to introduce the evidence of Martin’s drug use?

Andrew F. Branca is a MA lawyer with a long-standing interest in the law of self defense. He authored the seminal book “The Law of Self Defense” (second edition shipping June 22–save 30% and pre-order TODAY!), and manages the Law of Self Defense web site and blog. Many thanks to the Professor for the invitation to guest-blog on the Zimmerman trial here on Legal Insurrection!

