lee <lee <at> yagibdah.de> writes:

> is there a speech recognition software or the like which is capable to
> listen in on a phone call in order to put on screen as text what the
> other person is saying?

I like to say there are two main categories of effort here: one very
doable (a single, known voice), the other (arbitrary voices) plausibly
intractable at the moment.

> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.

This is possible if only a few voices are involved. Once their speech
patterns have been analyzed and stored, given ample resources, what you
seek can be done; accuracy is the constraint.

> If there's a phone capable of this, I'd like to know about it.

If you are after a solution that works with any voice, even limited to a
single language, then the answer is a long way away. Some would say it is
intractable. It also depends on the accuracy required, the complexity of
the vocabulary and sentence structure, and how much normal variation in
the voice(s) you allow.
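
To put a number on "accuracy required": the usual yardstick for
transcription quality is word error rate (WER), i.e. the word-level edit
distance between what was said and what was transcribed, divided by the
number of words said. Below is a minimal sketch in Python; the function
name and the sample sentences are made up for illustration and are not
taken from any particular product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    said = "please call me back tomorrow"    # hypothetical utterance
    transcribed = "please fall me back"      # hypothetical recognizer output
    print(word_error_rate(said, transcribed))
```

Here one word is substituted and one is dropped out of five, so the WER
is 0.4. How low that number has to be before on-screen text is actually
useful to a hard-of-hearing caller is exactly the open question.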

> Surely we should be able with nowadays technology to achieve this.

With Google-sized resources, you can mask the problem with templates for
many different voices, but the underlying problems abound without limit.
What you actually do is 'train' the Google system to customize its
transcription of a given voice, which becomes very accurate over time.

Now say my voice is altered by a throat infection, a depressed mood,
exuberance, etc.; you can see the trouble. In fact, the day after
watching horrible English cinema, which is often contagious (Monty
Python's Life of Brian), I often pick up a temporary 'Manchester slang'.
The gyrations are endless should one want to have a bit-o-fun with
language, particularly when any number of 'hillbilly' contaminants
manifest.

My mathematical belief is that the problem is intractable, certainly as
you approach a high level of required accuracy. In fact, folks routinely
joust with one another over the 'looseness of language' and its many
layers of meaning...

Truly intractable, but a 'dumbed down' approximation will surely exist at
some point. Include IBM in your searches, as they did quite a bit of
foundational research in a variety of related areas (speech/sound/voice).

Still, Google's offering might prove acceptable for your needs.

hth,
James