Gentoo Archives: gentoo-user

From: James <wireless@×××××××××××.com>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Re: speech recognition?
Date: Wed, 18 May 2016 00:32:48
Message-Id: loom.20160518T020714-377@post.gmane.org
In Reply to: [gentoo-user] speech recognition? by lee
lee <lee <at> yagibdah.de> writes:

> is there a speech recognition software or the like which is capable to
> listen in on a phone call in order to put on screen as text what the
> other person is saying?

I like to say that there are two main categories of effort here: one
very doable (a single voice), the other (an unlimited number of voices)
plausibly intractable at the moment.

> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.

This is possible if only a few voices are involved. Once their speech
patterns have been analyzed and stored, given ample resources, what you
seek is achievable; accuracy is the constraint.

> If there's a phone capable of this, I'd like to know about it.

If you are after a solution that can work with any voice, even limited
to a single language, then the answer is a long way away. Some would say
intractable. It also depends on the accuracy required, the complexity of
the vocabulary and sentence structure, and how much nominal variation in
the voice(s) you allow.

> Surely we should be able with nowadays technology to achieve this.

With Google-sized resources, you can mask the problem with templates
for many different voices, but the underlying problems abound without
limit. What you actually do is 'train' the Google system to customize
its translation of a given voice, so that it becomes very accurate over
time.

Now say I disguise my voice with a throat infection, a depressed mood,
exuberance, etc.; you can see the trouble. In fact, the day after
watching horrible English cinema, which is often contagious (Monty
Python's Life of Brian), I often develop a temporary 'Manchester slang'
in my vernacular. The gyrations are endless should one want to have a
bit-o-fun with language, particularly when any number of 'hill_billy'
contaminants manifest.

My mathematical belief is that the problem is intractable, certainly as
you approach a high level of required accuracy. In fact, folks routinely
joust with one another over the 'looseness of language' and its many
layered meanings....

Truly intractable, but a 'dumbed down' approximation surely will exist
at some point. Include IBM in your searches, as they did quite a bit of
foundational research in a variety of related areas (speech, sound,
voice).

Still, Google's offering might prove acceptable for your needs.

hth,
James
