[edk2-devel] VirtIO Sound Driver (GSoC 2021)

Sat Apr 17 18:04:55 UTC 2021

On 17.04.21 19:31, Andrew Fish via groups.io wrote:
>
>
>> On Apr 17, 2021, at 9:51 AM, Marvin Häuser <mhaeuser at posteo.de 
>> <mailto:mhaeuser at posteo.de>> wrote:
>>
>> On 16.04.21 19:45, Ethin Probst wrote:
>>> Yes, three APIs (maybe like this) would work well:
>>> - Start, Stop: begin playback of a stream
>>> - SetVolume, GetVolume, Mute, Unmute: control volume of output and 
>>> enable muting
>>> - CreateStream, ReleaseStream, SetStreamSampleRate: Control sample
>>> rate of stream (but not sample format since Signed 16-bit PCM is
>>> enough)
>>> Marvin, how do you suggest we make the events then? We need some way
>>> of notifying the caller that the stream has concluded. We could make
>>> the driver create the event and pass it back to the caller as an
>>> event, but you'd still have dangling pointers (this is C, after all).
>>> We could just make a IsPlaying() function and WaitForCompletion()
>>> function and allow the driver to do the event handling -- would that
>>> work?
>>
>> I do not know enough about the possible use-cases to tell. Aside from 
>> the two functions you already mentioned, you could also take in an 
>> (optional) notification function.
>> Which possible use-cases does determining playback end have? If it's 
>> too much effort, just use EFI_EVENT I guess, just the less code can 
>> mess it up, the better.
>>
>
> In UEFI EFI_EVENT works much better. There is a gBS-WaitForEvent() 
> function that lets a caller wait on an event. That is basically what 
> the UEFI Shell is doing at the Shell prompt. A GUI in UEFI/C is 
> basically an event loop.
>
> Fun fact: I ended up adding gIdleLoopEventGuid to the MdeModulePkg so 
> the DXE Core could signal gIdleLoopEventGuid if you are sitting in 
> gBS-WaitForEvent() and no event is signaled. Basically in EFI nothing 
> is going to happen until the next timer tick so the gIdleLoopEventGuid 
> lets you idle the CPU until the next timer tick. I was forced to do 
> this as the 1st MacBook Air had a bad habit of thermal tripping when 
> sitting at the UEFI Shell prompt. After all another name for a loop in 
> C code running on bare metal is a power virus.

Mac EFI is one of the best implementations we know of, frankly. I'm 
traumatised by Aptio 4 and alike, where (some issues are OEM-specific I 
think) you can have timer events signalling after ExitBS, there is event 
clutter on IO polling to the point where everything lags no matter what 
you do, and even in "smooth" scenarios there may be nothing worth the 
description "granularity" (events scheduled to run every 10 ms may run 
every 50 ms). Events are the last resort for us, if there really is no 
other way. My first GUI implementation worked without events at all for 
this reason, but as our workarounds got better, we did start using them 
for keyboard and mouse polling.

Timers do not apply here, but what does apply is resource management. 
Using EFI_EVENT directly means (to the outside) the introduction of a 
new resource to maintain, for each caller separately. On the other side, 
there is no resource to misuse or leak if none such is exposed. Yet, if 
you argue with APIs like WaitForEvent, something has to signal it. In a 
simple environment this would mean, some timer event is running and may 
signal the event the main code waits for, where above's concern actually 
do apply. :) Again, the recommendation assumes the use-cases are simple 
enough to easily avoid them.

I think it would be best to sketch use-cases for audio and design the 
solutions closely to the requirements. Why do we need to know when audio 
finished? What will happen when we queue audio twice? There are many 
layers (UX, interface, implementation details) of questions to coming up 
with a pleasant and stable design.

Best regards,
Marvin

>
> Thanks,
>
> Andrew Fish.
>
>> If I remember correctly you mentioned the UEFI Talkbox before, if 
>> that is more convenient for you, I'm there as mhaeuser.
>>
>> Best regards,
>> Marvin
>>
>>>
>>> On 4/16/21, Andrew Fish <afish at apple.com <mailto:afish at apple.com>> 
>>> wrote:
>>>>
>>>>> On Apr 16, 2021, at 4:34 AM, Leif Lindholm <leif at nuviainc.com 
>>>>> <mailto:leif at nuviainc.com>> wrote:
>>>>>
>>>>> Hi Ethin,
>>>>>
>>>>> I think we also want to have a SetMode function, even if we don't get
>>>>> around to implement proper support for it as part of GSoC (although I
>>>>> expect at least for virtio, that should be pretty straightforward).
>>>>>
>>>> Leif,
>>>>
>>>> I’m think if we have an API to load the buffer and a 2nd API to 
>>>> play the
>>>> buffer an optional 3rd API could configure the streams.
>>>>
>>>>> It's quite likely that speech for UI would be stored as 8kHz (or
>>>>> 20kHz) in some systems, whereas the example for playing a tune in GRUB
>>>>> would more likely be a 44.1 kHz mp3/wav/ogg/flac.
>>>>>
>>>>> For the GSoC project, I think it would be quite reasonable to
>>>>> pre-generate pure PCM streams for testing rather than decoding
>>>>> anything on the fly.
>>>>>
>>>>> Porting/writing decoders is really a separate task from enabling the
>>>>> output. I would much rather see USB *and* HDA support able to play pcm
>>>>> streams before worrying about decoding.
>>>>>
>>>> I agree it might turn out it is easier to have the text to speech 
>>>> code just
>>>> encode a PCM directly.
>>>>
>>>> Thanks,
>>>>
>>>> Andrew Fish
>>>>
>>>>> /
>>>>>    Leif
>>>>>
>>>>> On Fri, Apr 16, 2021 at 00:33:06 -0500, Ethin Probst wrote:
>>>>>> Thanks for that explanation (I missed Mike's message). Earlier I sent
>>>>>> a summary of those things that we can agree on: mainly, that we have
>>>>>> mute, volume control, a load buffer, (maybe) an unload buffer, and a
>>>>>> start/stop stream function. Now that I fully understand the
>>>>>> ramifications of this I don't mind settling for a specific format and
>>>>>> sample rate, and signed 16-bit PCM audio is, I think, the most widely
>>>>>> used one out there, besides 64-bit floating point samples, which I've
>>>>>> only seen used in DAWs, and that's something we don't need.
>>>>>> Are you sure you want the firmware itself to handle the decoding of
>>>>>> WAV audio? I can make a library class for that, but I'll definitely
>>>>>> need help with the security aspect.
>>>>>>
>>>>>> On 4/16/21, Andrew Fish via groups.io <http://groups.io> 
>>>>>> <afish=apple.com at groups.io <mailto:afish=apple.com at groups.io>> wrote:
>>>>>>>
>>>>>>>> On Apr 15, 2021, at 5:59 PM, Michael Brown <mcb30 at ipxe.org 
>>>>>>>> <mailto:mcb30 at ipxe.org>> wrote:
>>>>>>>>
>>>>>>>> On 16/04/2021 00:42, Ethin Probst wrote:
>>>>>>>>> Forcing a particular channel mapping, sample rate and sample 
>>>>>>>>> format
>>>>>>>>> on
>>>>>>>>> everyone would complicate application code. From an 
>>>>>>>>> application point
>>>>>>>>> of view, one would, with that type of protocol, need to do the
>>>>>>>>> following:
>>>>>>>>> 1) Load an audio file in any audio file format from any storage
>>>>>>>>> mechanism.
>>>>>>>>> 2) Decode the audio file format to extract the samples and audio
>>>>>>>>> metadata.
>>>>>>>>> 3) Resample the (now decoded) audio samples and convert (quantize)
>>>>>>>>> the
>>>>>>>>> audio samples into signed 16-bit PCM audio.
>>>>>>>>> 4) forward the samples onto the EFI audio protocol.
>>>>>>>> You have made an incorrect assumption that there exists a 
>>>>>>>> requirement
>>>>>>>> to
>>>>>>>> be able to play audio files in arbitrary formats.  This requirement
>>>>>>>> does
>>>>>>>> not exist.
>>>>>>>>
>>>>>>>> With a protocol-mandated fixed baseline set of audio parameters
>>>>>>>> (sample
>>>>>>>> rate etc), what would happen in practice is that the audio 
>>>>>>>> files would
>>>>>>>> be
>>>>>>>> encoded in that format at *build* time, using tools entirely 
>>>>>>>> external
>>>>>>>> to
>>>>>>>> UEFI.  The application code is then trivially simple: it just does
>>>>>>>> "load
>>>>>>>> blob, pass blob to audio protocol".
>>>>>>>>
>>>>>>>
>>>>>>> Ethin,
>>>>>>>
>>>>>>> Given the goal is an industry standard we value interoperability 
>>>>>>> more
>>>>>>> that
>>>>>>> flexibility.
>>>>>>>
>>>>>>> How about another use case. Lets say the Linux OS loader (Grub) 
>>>>>>> wants
>>>>>>> to
>>>>>>> have an accessible UI so it decides to sore sound files on the EFI
>>>>>>> System
>>>>>>> Partition and use our new fancy UEFI Audio Protocol to add audio 
>>>>>>> to the
>>>>>>> OS
>>>>>>> loader GUI. So that version of Grub needs to work on 1,000 of 
>>>>>>> different
>>>>>>> PCs
>>>>>>> and a wide range of UEFI Audio driver implementations. It is a much
>>>>>>> easier
>>>>>>> world if Wave PCM 16 bit just works every place. You could add a 
>>>>>>> lot of
>>>>>>> complexity and try to encode the audio on the fly, maybe even in 
>>>>>>> Linux
>>>>>>> proper but that falls down if you are booting from read only 
>>>>>>> media like
>>>>>>> a
>>>>>>> DVD or backup tape (yes people still do that in server land).
>>>>>>>
>>>>>>> The other problem with flexibility is you just made the test matrix
>>>>>>> very
>>>>>>> large for every driver that needs to get implemented. For 
>>>>>>> something as
>>>>>>> complex as Intel HDA how you hook up the hardware and what 
>>>>>>> CODECs you
>>>>>>> use
>>>>>>> may impact the quality of the playback for a given board. Your 
>>>>>>> EFI is
>>>>>>> likely
>>>>>>> going to pick a single encoding at that will get tested all the 
>>>>>>> time if
>>>>>>> your
>>>>>>> system has audio, but all 50 other things you support not so 
>>>>>>> much. So
>>>>>>> that
>>>>>>> will required testing, and some one with audiophile ears (or an AI
>>>>>>> program)
>>>>>>> to test all the combinations. I’m not kidding I get BZs on the 
>>>>>>> quality
>>>>>>> of
>>>>>>> the boot bong on our systems.
>>>>>>>
>>>>>>>
>>>>>>>>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>>>>>>>>>  EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
>>>>>>>>> } EFI_SIMPLE_AUDIO_PROTOCOL;
>>>>>>>> This is now starting to look like something that belongs in 
>>>>>>>> boot-time
>>>>>>>> firmware.  :)
>>>>>>>>
>>>>>>> I think that got a little too simple I’d go back and look at the 
>>>>>>> example
>>>>>>> I
>>>>>>> posted to the thread but add an API to load the buffer, and then 
>>>>>>> play
>>>>>>> the
>>>>>>> buffer (that way we can an API in the future to twiddle knobs). That
>>>>>>> API
>>>>>>> also implements the async EFI interface. Trust me the 1st thing 
>>>>>>> that is
>>>>>>> going to happen when we add audio is some one is going to 
>>>>>>> complain in
>>>>>>> xyz
>>>>>>> state we should mute audio, or we should honer audio volume and mute
>>>>>>> settings from setup, or from values set in the OS. Or some one 
>>>>>>> is going
>>>>>>> to
>>>>>>> want the volume keys on the keyboard to work in EFI.
>>>>>>>
>>>>>>> Also if you need to pick apart the Wave PCM 16 byte file to feed 
>>>>>>> it into
>>>>>>> the
>>>>>>> audio hardware that probably means we should have a library that 
>>>>>>> does
>>>>>>> that
>>>>>>> work, so other Audio drivers can share that code. Also having a 
>>>>>>> library
>>>>>>> makes it easier to write a unit test. We need to be security 
>>>>>>> conscious
>>>>>>> as we
>>>>>>> need to treat the Audo file as attacker controlled data.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andrew Fish
>>>>>>>
>>>>>>>> Michael
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Signed,
>>>>>> Ethin D. Probst
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#74230): https://edk2.groups.io/g/devel/message/74230
Mute This Topic: https://groups.io/mt/81710286/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-