Hello everyone! I'm interested in running your framework on my Blackpill board (STM32F411CEU6). I've noticed that I'm experiencing poor performance, and I'm wondering if I might be using your framework incorrectly or if this is expected behavior. Since there are no examples provided for embedded applications, I'm concerned that I may have made a mistake somewhere.
For context, I've connected a PCM5102 DAC over I2S and feed it via DMA so that the MCU isn't tied up by the I2S transfers. Below are the relevant portions of my main.c file, with comments for clarity:
#include "leaf.h"

I2S_HandleTypeDef hi2s1;

// Constants
#define SAMPLERATE 44000
#define LEAF_BUFFER_SIZE (2*44100)
#define BUFFER_SIZE 8192

// Buffers
char mempool[10000]; // LEAF Memory pool
uint16_t samplebuffer[BUFFER_SIZE] = {0}; // Buffer, used for transmission to I2S codec with DMA

// Pointers, used for switching between buffers in DMA transfer
volatile uint16_t *current_buffer_element_ptr = samplebuffer;
volatile size_t current_buffer_element_cntr = 0;

// LEAF objects
LEAF leaf;
tCycle cycle;
tHermiteDelay delay;

// Utility functions
float rnd_func()
{
    return ((float)rand() / (float)(RAND_MAX));
}

// Callbacks used for DMA transfers. When the first half of the buffer has been sent
// (i2s_transfer_half_complited_callback is called), current_buffer_element_ptr is reset to the
// beginning so that LEAF can refill that half while the other half is being sent to the DAC,
// and vice versa.
void i2s_transfer_complited_callback(I2S_HandleTypeDef *hi2s)
{
    if (current_buffer_element_cntr >= BUFFER_SIZE / 2)
    {
        current_buffer_element_ptr = samplebuffer + BUFFER_SIZE / 2;
        current_buffer_element_cntr = 0;
    } else {
        printf("buffer overrun");
    }
}

void i2s_transfer_half_complited_callback(I2S_HandleTypeDef *hi2s)
{
    if (current_buffer_element_cntr >= BUFFER_SIZE / 2)
    {
        current_buffer_element_ptr = samplebuffer;
        current_buffer_element_cntr = 0;
    } else {
        printf("buffer overrun");
    }
}

int main(void)
{
    // CubeMX-generated initialization
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_DMA_Init();
    MX_I2S1_Init();
    MX_NVIC_Init();

    // Callbacks for the DMA transfer, where I switch buffer halves
    HAL_I2S_RegisterCallback(&hi2s1, HAL_I2S_TX_COMPLETE_CB_ID, &i2s_transfer_complited_callback);
    HAL_I2S_RegisterCallback(&hi2s1, HAL_I2S_TX_HALF_COMPLETE_CB_ID, &i2s_transfer_half_complited_callback);
    HAL_I2S_Transmit_DMA(&hi2s1, samplebuffer, sizeof(samplebuffer)/sizeof(samplebuffer[0]));

    // LEAF initialization
    LEAF_init(&leaf, SAMPLERATE, mempool, LEAF_BUFFER_SIZE, &rnd_func);
    tCycle_init(&cycle, &leaf);
    tCycle_setFreq(&cycle, 220);
    tHermiteDelay_init(&delay, 2000, 2500, &leaf);
    tHermiteDelay_setGain(&delay, 0.5f);

    uint64_t counter = 0;
    while (1)
    {
        // Once the DMA controller has successfully finished a transfer to the audio codec, we can put new data there.
        // This part works fine when only simple processing is done here.
        if (current_buffer_element_cntr < BUFFER_SIZE / 2)
        {
            counter++;
            if ((counter % 100000) == 10000)
                tCycle_setFreq(&cycle, 220);
            else if ((counter % 100000) == 20000)
                tCycle_setFreq(&cycle, 330);
            else if ((counter % 100000) == 30000)
                tCycle_setFreq(&cycle, 220);
            else if ((counter % 100000) == 40000)
                tCycle_setFreq(&cycle, 0);

            float processed_value = tCycle_tick(&cycle);
            //float delayed_value = tHermiteDelay_tick(&delay, processed_value); // <<<< LOOK HERE
            //processed_value = delayed_value; // <<<<< LOOK HERE
            *(current_buffer_element_ptr + current_buffer_element_cntr) = (uint16_t) (0x0fff * (1.0f + processed_value));
            current_buffer_element_cntr++;
        }
    }
}
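
The half-buffer switching done by the two DMA callbacks can be modeled on a host machine, which may help when checking the logic separately from the hardware. The following is only a minimal sketch, not the firmware itself: `PingPong`, `SIM_BUFFER_SIZE`, and the `pingpong_*` functions are hypothetical names introduced for illustration, and the buffer is shrunk to 8 entries. It mirrors the callbacks above in that a half is only handed over once the producer has filled the current one; otherwise an underrun is counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_BUFFER_SIZE 8                  /* hypothetical size, small for illustration */
#define SIM_HALF (SIM_BUFFER_SIZE / 2)

/* Host-side model of the ping-pong buffer (hypothetical type). */
typedef struct {
    uint16_t buffer[SIM_BUFFER_SIZE];
    size_t write_base;    /* start index of the half the producer fills */
    size_t write_count;   /* samples already written into that half */
    unsigned underruns;   /* times a callback fired before the half was full */
} PingPong;

void pingpong_init(PingPong *pp)
{
    assert(pp != NULL);
    memset(pp, 0, sizeof *pp);
}

/* Producer side: store one sample; returns 1 if stored, 0 if the half is full. */
int pingpong_push(PingPong *pp, uint16_t sample)
{
    assert(pp != NULL);
    if (pp->write_count >= SIM_HALF)
        return 0;
    pp->buffer[pp->write_base + pp->write_count] = sample;
    pp->write_count++;
    return 1;
}

/* Shared callback logic: switch halves only if the current half was filled. */
static void pingpong_switch(PingPong *pp, size_t new_base)
{
    if (pp->write_count >= SIM_HALF) {
        pp->write_base = new_base;
        pp->write_count = 0;
    } else {
        pp->underruns++;
    }
}

/* Models the half-complete callback: the first half is free again. */
void pingpong_on_half_complete(PingPong *pp)
{
    pingpong_switch(pp, 0);
}

/* Models the complete callback: the second half is free again. */
void pingpong_on_complete(PingPong *pp)
{
    pingpong_switch(pp, SIM_HALF);
}
```

In this model, calling `pingpong_push` until it returns 0 and then calling one of the two callback functions corresponds to one iteration of the main loop followed by a DMA interrupt. A nonzero `underruns` value means the producer was too slow for the simulated transfer.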
With the two delay lines commented out, as shown above, the code works as expected. Here is a recording of the sound when the sequence runs correctly:
Record - sequence, works ok
Next, I uncommented the two lines marked "<<<< LOOK HERE". This routes the audio through the delay, and as a result the sound became severely distorted:
Record - sequence + delay, broken
I also tested similar code on a host machine (you can find it as an example in my fork: https://github.com/leechwort/LEAF/blob/master/Examples/sawtooth-sequence.c) and it worked fine. This leads me to suspect that the issue might be related to performance limitations on the STM32 board.
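
To put a number on the performance suspicion, the per-sample cycle budget can be estimated with simple arithmetic. This is a rough sketch under assumptions that are not stated above: the core is assumed to run at 100 MHz (the maximum for the STM32F411), and each 16-bit buffer entry is assumed to be one channel slot of a stereo I2S frame, so entries are consumed at twice the sample rate. `cycles_per_buffer_entry` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles available to produce one buffer entry.
 * core_clock_hz and channels are assumptions supplied by the caller. */
uint32_t cycles_per_buffer_entry(uint32_t core_clock_hz,
                                 uint32_t sample_rate_hz,
                                 uint32_t channels)
{
    assert(sample_rate_hz > 0 && channels > 0);
    return core_clock_hz / (sample_rate_hz * channels);
}
```

Under these assumptions, `cycles_per_buffer_entry(100000000u, 44100u, 2u)` gives 1133 cycles per entry. Everything done per loop iteration, including the oscillator tick, the delay tick, and the 64-bit modulo operations, has to fit within that budget.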
In summary, I have a few questions:
- Can you suggest what might be causing this behavior on the STM32 board? Is there a specific way I should be using your framework for embedded systems that differs from using it on a host machine?
- Do you have any example projects specifically designed for the STM32 platform that I could refer to for guidance?
- It appears that the FPU (Floating-Point Unit) is not utilized in this framework. Do you have plans to implement FPU support in the future?
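
Regarding the FPU question: whether floating-point code uses the hardware FPU is decided by the compiler options of the build (for GCC on a Cortex-M4F, typically `-mfpu=fpv4-sp-d16 -mfloat-abi=hard`) rather than by the library source. A minimal sketch of a compile-time check is shown below. It relies on the `__ARM_FP` macro from the Arm C Language Extensions, where bit 2 (value 0x4) indicates hardware single-precision support; `hardware_single_precision_fpu_enabled` is a hypothetical helper name.

```c
#include <stdio.h>

/* Returns 1 if this translation unit was compiled for hardware
 * single-precision floating point (ACLE __ARM_FP bit 2), else 0. */
int hardware_single_precision_fpu_enabled(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x4)
    return 1;
#else
    return 0;
#endif
}

/* Prints the result; on the target, printf would need to be retargeted. */
void report_fpu_build_setting(void)
{
    printf("hardware single-precision FP enabled at compile time: %d\n",
           hardware_single_precision_fpu_enabled());
}
```

If this reports 0 in the firmware build, float arithmetic is being done by software routines, which would be considerably slower.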