Preventing AVR’s EEPROM Corruption

I have been observing rhythm patterns being corrupted intermittently while I develop the firmware of the 909 clone. I first thought it is due to some bug in my firmware, but the occurrences are too random for it. I started thinking that it is some hardware problem in the processor, so looked this issue closely. I noticed that the corruptions happen only when I power cycle the processor. It’s a good clue. I searched around to know what this problem is and how to prevent it.

Continue reading

Asynchronous EEPROM Write Function for AVR

The TR-909 clone firmware I’m currently developing writes sequence patterns to eeprom. The avr-libc library provides eeprom writing function, but it is synchronous.

https://www.nongnu.org/avr-libc/user-manual/group__avr__eeprom.html

However, many of the real-time processing in the firmware are done without interrupts, then these processes stall during the slow (synchronous) function. So I decided to develop the asynchronous version of eeprom writing function by my own.

Continue reading

Implementing a Decay Curve Generator into Microprocessor

In the previous article, I considered how to calculate a decay curve without using a lookup table. I tried implementing it to a microprocessor (atmega168). But I hit a few obstacles.

First I planned to keep envelope value as a Q10.6 fixed point number (meaning 10 integer bits and 6 fractional bits), so that I can conveniently use a 16-bit integer. But I found it does not give enough resolution of the data. I needed to extend it to 32-bit (Q10.22). Integer part is enough with 10 bit because it’s the resolution of the PWM output.

Then I wrote a program like following (shortened from the actual program). It looks to work, but it did not. The value did not show up to g_value where I keep the current envelope amount.

volatile uint32_t g_value;  // value holder, Q10.22 fixed point number
volatile uint32_t g_decay_factor;  // amount of decay,  Q0.32 fixed point
volatile uint8_t g_update_ready;

ISR(Time1_OVF_vect) {
  OCR1A = g_value >> 22;  // reflect the value to PWM、Q10.22 to integer
  g_update_ready = 1;
}

void UpdateValue() {
  if (!update_ready) {
    return;
  }
  // reduce the value by multiplying the decay factor
  uint64_t temp = g_value;
  temp *= g_decay_factor;
  g_value = temp >> 32;
  g_update_ready = false;
}

int main() {
  g_value = 1023 << 22; // integer to Q10.22
  g_decay_factor = 0.99309249543703590153321021688807 * 0x100000000;
  g_update_ready = 0;

  while (true) {
    UpdateValue();
  }
}

I couldn’t find any document or article explaining this, but it seems that avr-gcc, the compiler for AVR, is not following C language standard strictly. I presume it’s because there is some necessity to control register usage tightly by way of writing code (but I don’t know whether I’m right. I, again, couldn’t find any doc yet). I modified the code as follows, then it started working.

volatile uint32_t g_value;  // value holder, Q10.22 fixed point number
volatile uint32_t g_decay_factor;  // amount of decay,  Q0.32 fixed point
volatile uint8_t g_update_ready;

ISR(Time1_OVF_vect) {
  register uint32_t temp = g_value;  // copy the Q10.22 value to a variable
  temp >>= 22;  // Q10.22 to integer
  OCR1A = temp;  // reflect the value to PWM
  g_update_ready = 1;
}

void UpdateValue() {
  if (!update_ready) {
    return;
  }
  register uint64_t temp = g_value;
  // copy the Q10.22 value to a 64-bit int
  temp *= g_decay_factor;
  // multiply by the Q0.32 decay factor
  temp >>= 32;  // shift 32 bit to adjust scale, now it's Q10.22 again
  g_value = temp;
  // copy back to the value holder
  g_update_ready = false;
}

int main() {
  register uint32_t temp = 1023 << 22;   // this is calculated by the compiler
  g_value = temp;
  g_decay_factor = 0.99309249543703590153321021688807 * 0x100000000;
  g_update_ready = 0;

  while (true) {
    UpdateValue();
  }
}

The program works even without the “register” keywords. The compiler seems to take care of it.

The key points in the second implementation are:

  • Clarify where to load the data (such as g_value to temp)
  • Clarify where a calculation happens (use compound assignment operators such as *=, <<=, and >>=, rather than binary operators (*, <<, >>, etc.)
  • Do not mix assignment (loading data) and calculation (modifying the data on the register) in a single line

I have no idea which point was effective. But I keep this note hoping it helps when I have similar issue with microprocessor programming in future.

Calculating a Decaying Exponential Curve Recursively by using Fixed Point Multiplication

I’ve been working on implementing Envelope Generator into microprocessors. Output of an Envelope Generator consists of rising and falling exponential curve. I’m currently doing table lookup to generate those curves.

https://github.com/naokiiwakami/vceg

This is a quick and straightforward approach, and you don’t have to be worried much about data resolution. But the program is a little complicated, and memory consuming. I don’t really like this approach. I don’t really like the approach, so I started to think about alternative approaches.

Mathematically, a exponentially decaying curve can be calculated recursively by following formula:

V(i+1) = V(i) * k

where 0 <= k < 1

This is very simple. You can program this such as:

double value = INITIAL_VALUE;
double decay_factor = 0.99309249543703590153321021688807;
while (true) {
  value *= decay_factor;
}

but implementing the algorithm into a microprocessor is not straightforward like this.

I’m planning to use an AVR (e.g. atmega168), it has limited number of registers, and floating point multiplication is generally slow in AVR. So I would like to avoid floating point data types.

So shall I go for simple integer calculation? It’s actually not feasible. For example, if I update the value in every 10ms (100 updates per second) and would like to make a decaying curve that reduces its value to 50% after 1 second, the decay factor k would be

k = 0.5 ^ (1 / 100) = 0.99309249543703590153321021688807

It’s less than 1 so cannot be expressed by an integer.

There’s a technique called fixed point multiplication. This fundamentally uses integers but can handle fractions. This is my first time to try this technique, so learned how you do it. I found following link a good instruction:

https://www.allaboutcircuits.com/technical-articles/multiplication-examples-using-the-fixed-point-representation/

Tried to write it in C.

#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[]) {
  // fixed point integer 10bit.6bit
  uint16_t value = (1023) << 6;
  // fixed point integer .16bit
  uint16_t decay_factor = 0.99309249543703590153321021688807 * 65536;
  for (uint16_t i = 0; i < 1000; ++i) {
    // multiplication
    uint32_t temp = value * decay_factor;
    value = temp >> 16;
    // retrieve the integer part
    uint16_t output = value >> 6;
    printf("%d %d\n", i, output, 10);
  }
  return 0;
}

Ran it on my PC (not on the microprocessor yet). Plotted the result to a graph:

OK it looks good so far.

Where should I connect the USB Connector Shield, or should I not?

I’m about to start wiring USB socket to experiment MIDI USB. But first, I wonder where I should connect the connector shield to. Should I connect the case to GND or should I do otherwise?

I found a good thread in Electrical Engineering Stack Exchange.

tl;dr: Never connect to GND. Leave it unconnected or connect to the shield of the rig. I should take the former in my current project.

Although it’s long, this thread is pretty interesting so is worth reading.

DC-DC Booster Design using MC34063

DR-110 that I am modifying currently takes 6V power supply from 4 AA batteries or 9V from an AC adapter. But since I’m thinking of connecting to USB, it’s convenient if we can draw power from it. From the device circuit diagram, I found the device requires at least 6V supply although USB supply power is 5V. It’s a good opportunity to learn how to use DC-DC booster since more project that uses USB will come.

I live near Jameco. It’s the best to use one they carry. It’s even better if the package is DIP since I often use breadboards. I checked their catalog and found a switching regulator called MC34063 is cheap and has plenty of inventory. It is versatile, too, that supports boosting, busting, and negating the power supply. I think this is a good choice for my regular switching regulator.

For DR-110, I decided to put 7V that is slightly higher than battery power. The circuit has a simple linear regulator. I expect it removes noise in the switching regulator output as long as it is small. The current supply probably is good enough with 50mA.

Collecting document is necessary first. Jameco provides MC34063 manufactured by STMicro, but their datasheet is simplified and vague. I cannot understand how to use it for lack of knowledge of switching regulator fundamentals. I also found decent application notes from ON Semiconductor and TI sites. They looked to have the same source and the contents look the same.

Following is an example circuit diagram of boost regulator. The circuit has a n oscillator that drives switch Q1 and Q2 that draws current from L to accumulate energy to boost power supply. The output voltage is determined by comparator that listens on output and adapts duty cycle of the switch in negative feedback manner. Well, it’s a smart design.

First, I used this circuit as is with modifying only R2 to make 7V output, but it generates a lot of audible ripple noises in the output. This approach was useless. I should took proper way that is to calculate circuit parameters by reading and understanding application notes. The application notes has enough information to design the circuit.

The frequency of the oscillator is determined by CT, and other parts hardly affect it. I decided the frequency first (a little lower than 100 kHz that is highest possible frequency). Then decided the switching duty cycle. Finally decided minimum L value. Tried the circuit using simulator and tweaked the parameters. The diagram below is the designed parameters. The resistor R4 is a load.

I found one difficulty in designing a switching regulator is that the behavior changes significantly when the load changes. Analog circuit usually does not change its circuit current drastically, so it might be better to figure it out when attaching a switching regulator. Following picture shows the behavior change when the load is very light (100 kΩ). Switching timings are much slower. Slow ripples are clearly visible, which must be so bad to an analog audio circuit. It’s better to experiment with real circuit to proceed further. But I don’t have a 10 μH inductor right now. So that’s it for today.

DR-110 Driving Pulses

The schematic of DR-110 is:

For this circuit, each percussion element seems to sound when you put a negative pulse of width 1 ms. I’ve put oscilloscope to observe the pulses come as expected.

Here are the pulses to the bass drum and output:

Blue: Input pulse, Red: Output
Blue: Input pulse, Red: Output

The pulse width looks to be around 1.3 ms.

These are the pulses to the snare drum and output:

Blue: Input pulse: Red: Output
Blue: Input pulse, Red: Output

For the hand clap, I only took pulses:

Blue: CP I, Red: CP II

This is a simulated output of the bass drum circuit:

The output would be weaker when the pulse is narrower:

A wider pulse gives a stronger output, but the waveform starts distorting significantly at certain point.

Started Modifying DR-110

It’s been a while since I stopped building things. But it’s a new year, I’ll try restarting it. For not making things for long, I cannot do anything aggressive yet. I would start with picking up an old DR-110 from my toy box to modify it.

This device is half-dead. Upper half of the display is not functioning. Membrane switches are not working well, too. The case is nice looking, so it may be fun to keep it and make it work. But it’s an old device and what can be done is pretty limited compared to contemporary feature rich devices. I would go simply with replacing the control by MIDI to make it done quickly.

DR-110 has four banks of memorable rhythm patterns from A to D. C and D out of them are preset so you can play these patterns without programming. But banks A and B are not usable since I cannot punch the patterns in because of the broken display. These banks C, and D, still plays nicely:

DR-110 Patterns (Bank C)
DR-110 Patterns (Bank D)

The sound is a bit old school, but I love it.

This is the starting point. I begin with investigating internals of the device.

Building DPDK on Raspberry Pi 3

I’m trying to build DPDK on Raspberry Pi 3 but couldn’t do it on Raspbian since it’s a 32-bit OS. DPDK uses several assembler instructions for ARM 8 that are only valid on 64-bit OS. So my effort starts with installing a 64-bit OS.

I found SUSE has released 64-bit Linux for Raspberry Pi 3:
https://en.opensuse.org/HCL:Raspberry_Pi3

I started with this OS image.

Choosing the Image

Three versions of distributions are avalable for Raspberry Pi 3:

  • openSUSE Leap
  • openSUSE Tumbleweed
  • non-upstream openSUSE Tumbleweed

I tried openSUSE Leap first, but could not zypper update. Then I tried openSUSE Tumbleweed. This worked fine.

There are four variations of images:

  • JeOS image (Just Enough OS)
  • E20 image
  • XFCE image
  • LXQT image
  • X11 image

I wanted to save disk space, so took JeOS.

Installation was done flawlessly. I’m using wired network to avoid spending time for WiFi setup.

Build Preparation

The system is missing compiler and so on. I first needed to install development tools. After logging in as root, I executed following.

# zypper refresh
# zypper update
# zypper install -t pattern devel_C_C++
# zypper install emacs libnuma-devel ncurses-devel kernel-source man

Then the system is ready for building DPDK.

Building DPDK

# see http://dpdk.org/dev
# and http://dpdk.org/doc/guides/linux_gsg/build_dpdk.html
$ git clone git://dpdk.org/dpdk
$ cd dpdk
$ make install T=arm64-armv8a-linuxapp-gcc

Then compiler has passed, but…

Test Run

localhost:/home/naoki/workspace/dpdk/arm64-armv8a-linuxapp-gcc/app # ./testpmd 
ERROR: This system does not support "AES".
Please check that RTE_MACHINE is set correctly.
EAL: FATAL: unsupported cpu type.
EAL: unsupported cpu type.
PANIC in main():
Cannot init EAL
1: [./testpmd(rte_dump_stack+0x1c) [0x4c1854]]
Aborted (core dumped)

The CPU does not seem to support AES (Advanced Encryption Standard) instructions:

localhost:/home/naoki/workspace/dpdk/arm64-armv8a-linuxapp-gcc/app # lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Model:               4
BogoMIPS:            38.40
NUMA node0 CPU(s):   0-3
Flags:               fp asimd evtstrm crc32 cpuid

The CPU of Raspberry Pi 3 is Cortex-A53. The Wikipedia page for Arm architecture sounds like Cortex-A53 supports AES instructions, but according to Arm documentation, it seems to be up to implementation. So, I think AES instructions are unsupported by Raspberry Pi 3. How does DPDK think it’s supported then?

I found following code snippet in part of makefiles:

AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(WERROR_FLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null)

...

ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRYPTO),)
CPUFLAGS += AES
CPUFLAGS += PMULL
CPUFLAGS += SHA1
CPUFLAGS += SHA2
endif

So, gcc returns flag __ARM_FEATURE_CRYPTO that indicates encryption instructions support. But as seen in the previous lscpu output, the CPU does not support it. I guess the compiler gives wrong information because it’s a binary install via zypper. The compiler probably would return correct flags if built from source code, but it’s time consuming. Instead of building gcc, I put a quick workaround to DPDK as follows and rebuilt DPDK.

diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
index 500421a20..c83045b84 100644
--- a/mk/machine/armv8a/rte.vars.mk
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -55,4 +55,4 @@
 # CPU_LDFLAGS =
 # CPU_ASFLAGS =
 
-MACHINE_CFLAGS += -march=armv8-a+crc+crypto
+MACHINE_CFLAGS += -march=armv8-a+crc
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index a813c91f4..cb899e86c 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -125,12 +125,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
 CPUFLAGS += CRC32
 endif
 
-ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRYPTO),)
-CPUFLAGS += AES
-CPUFLAGS += PMULL
-CPUFLAGS += SHA1
-CPUFLAGS += SHA2
-endif
+# ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRYPTO),)
+# CPUFLAGS += AES
+# CPUFLAGS += PMULL
+# CPUFLAGS += SHA1
+# CPUFLAGS += SHA2
+# endif
 
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))

In this time, execution looks better, although it failed again.

localhost:/home/naoki/workspace/dpdk/arm64-armv8a-linuxapp-gcc/app # ./testpmd 
EAL: Detected 4 lcore(s)
EAL: Probing VFIO support...
EAL: No probed ethernet devices
USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory

I think this is because I haven’t setup huge TLB fs yet.

(To be continued)

PS: This may be helpful later:

https://www.suse.com/communities/blog/compiling-de-linux-kernel-suse-way/