August 31, 2018

Building Yocto Linux for the Ultra96

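The DisplayPort on the Ultra96 does not work in bare metal and I see no prospect of getting it working, so I decided to try building Yocto Linux for the Ultra96. My reference is ひでみ (Hidemi)'s doujinshi 「超苦労したFPGAの薄い本 Yocto Projectと立ち上げ編」 (roughly, "The Hard-Won FPGA Thin Book: Yocto Project and Bring-Up"). There is also a companion volume, 「超苦労したFPGAの薄い本（高位合成とリコンフィグ編）」 (the high-level synthesis and reconfiguration volume), which I have bought as well.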

Now, let's build Yocto for the Ultra96. The procedure is written up in the Yocto Project volume of the book, so I will not go into detail here. Please refer to the book.

I ran download_meta.sh to download the layers, changed into the poky directory, and ran
source ./oe-init-build-env ../build_zynqmp
(This step differs from what the book describes.)

I edited bblayers.conf and local.conf, then built Yocto with
bitbake core-image-minimal
It turned out that the build did not pass unless I commented out the following lines in bblayers.conf:

DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
IMAGE_INSTALL_append = " dhcp"

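The build seemed to be going smoothly and I expected it to take a long time, so I went to bed. When I got up the next morning, Ubuntu had rebooted.
Wondering what had happened, I logged in and ran the following again:
source ./oe-init-build-env ../build_zynqmp
bitbake core-image-minimal
It looks like the build had already finished.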

masaaki@masaaki-H110M4-M01:~/Ultra96_Yocto/poky$ source ./oe-init-build-env ../build_zynqmp

### Shell environment set up for builds. ###

You can now run 'bitbake <target>'

Common targets are:
    core-image-minimal
    core-image-sato
    meta-toolchain
    meta-ide-support

You can also run generated qemu images with a command like 'runqemu qemux86'
masaaki@masaaki-H110M4-M01:~/Ultra96_Yocto/build_zynqmp$ bitbake core-image-minimal
Loading cache: 100% |############################################| Time: 0:00:00
Loaded 2102 entries from dependency cache.
Parsing recipes: 100% |##########################################| Time: 0:00:00
Parsing of 1470 .bb files complete (1469 cached, 1 parsed). 2103 targets, 321 skipped, 0 masked, 0 errors.
NOTE: Resolving any missing task queue dependencies

Build Configuration:
BB_VERSION           = "1.38.0"
BUILD_SYS            = "x86_64-linux"
NATIVELSBSTRING      = "universal"
TARGET_SYS           = "aarch64-poky-linux"
MACHINE              = "zcu102-zynqmp"
DISTRO               = "poky"
DISTRO_VERSION       = "2.5.1"
TUNE_FEATURES        = "aarch64"
TARGET_FPU           = ""
meta
meta-poky
meta-yocto-bsp       = "sumo:45ef387cc54a0584807e05a952e1e4681ec4c664"
meta-oe              = "sumo:b0950aeff5b630256bb5e25ca15f4d59c115e7c1"
meta-xilinx-bsp
meta-xilinx-contrib  = "sumo:5fccc46503e468ed024185ed032891799a31db58"

Initialising tasks: 100% |#######################################| Time: 0:00:02
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
WARNING: core-image-minimal-1.0-r0 do_image_wic: Manifest /home/masaaki/Ultra96_Yocto/build_zynqmp/tmp/sstate-control/manifest-x86_64_aarch64-zynqmp-pmu-gcc-cross-microblazeel.populate_sysroot not found in x86_64_aarch64 (variant '')?
WARNING: core-image-minimal-1.0-r0 do_image_wic: Manifest /home/masaaki/Ultra96_Yocto/build_zynqmp/tmp/sstate-control/manifest-x86_64_aarch64-zynqmp-pmu-binutils-cross-microblazeel.populate_sysroot not found in x86_64_aarch64 (variant '')?
WARNING: core-image-minimal-1.0-r0 do_image_complete: Manifest /home/masaaki/Ultra96_Yocto/build_zynqmp/tmp/sstate-control/manifest-x86_64_aarch64-zynqmp-pmu-gcc-cross-microblazeel.populate_sysroot not found in x86_64_aarch64 (variant '')?
WARNING: core-image-minimal-1.0-r0 do_image_complete: Manifest /home/masaaki/Ultra96_Yocto/build_zynqmp/tmp/sstate-control/manifest-x86_64_aarch64-zynqmp-pmu-binutils-cross-microblazeel.populate_sysroot not found in x86_64_aarch64 (variant '')?
NOTE: Tasks Summary: Attempted 3454 tasks of which 2725 didn't need to be rerun and all succeeded.

Summary: There were 4 WARNING messages shown.

The build artifacts are listed below.

Image
Image--4.14-xilinx-v2018.1+git0+4ac76ffacb-r0-zcu102-zynqmp-20180830124849.bin
Image--4.14-xilinx-v2018.1+git0+4ac76ffacb-r0-zynqmp-zcu102-rev1.0-20180830124849.dtb
Image-zcu102-zynqmp.bin
Image-zynqmp-zcu102-rev1.0.dtb
arm-trusted-firmware--1.4-xilinx-v2018.1+gitAUTOINC+df4a7e97d5-r0-20180830180835.bin
arm-trusted-firmware--1.4-xilinx-v2018.1+gitAUTOINC+df4a7e97d5-r0-20180830180835.elf
arm-trusted-firmware--1.4-xilinx-v2018.1+gitAUTOINC+df4a7e97d5-r0-20180830180835.ub
arm-trusted-firmware.bin
arm-trusted-firmware.elf
arm-trusted-firmware.ub
atf-uboot.ub
boot.bin
boot.bin-zcu102-zynqmp
boot.bin-zcu102-zynqmp-v2018.01-xilinx-v2018.1+gitAUTOINC+949e5cb9a7-r0
core-image-minimal-zcu102-zynqmp-20180830180835.qemuboot.conf
core-image-minimal-zcu102-zynqmp-20180830180835.rootfs.cramfs
core-image-minimal-zcu102-zynqmp-20180830180835.rootfs.manifest
core-image-minimal-zcu102-zynqmp-20180830180835.rootfs.tar.gz
core-image-minimal-zcu102-zynqmp-20180830180835.rootfs.wic.qemu-sd
core-image-minimal-zcu102-zynqmp-20180830180835.testdata.json
core-image-minimal-zcu102-zynqmp.cramfs
core-image-minimal-zcu102-zynqmp.manifest
core-image-minimal-zcu102-zynqmp.qemuboot.conf
core-image-minimal-zcu102-zynqmp.tar.gz
core-image-minimal-zcu102-zynqmp.testdata.json
core-image-minimal-zcu102-zynqmp.wic.qemu-sd
modules--4.14-xilinx-v2018.1+git0+4ac76ffacb-r0-zcu102-zynqmp-20180830124849.tgz
modules-zcu102-zynqmp.tgz
pmu-firmware--v2018.1+gitAUTOINC+aaa566bc3f-r0-zcu102-zynqmp-20180830180835.bin
pmu-firmware--v2018.1+gitAUTOINC+aaa566bc3f-r0-zcu102-zynqmp-20180830180835.elf
pmu-firmware-zcu102-zynqmp.bin
pmu-firmware-zcu102-zynqmp.elf
pmu-zcu102-zynqmp.bin
pmu-zcu102-zynqmp.elf
qemu-hw-devicetrees
u-boot-zcu102-zynqmp-v2018.01-xilinx-v2018.1+gitAUTOINC+949e5cb9a7-r0.bin
u-boot-zcu102-zynqmp-v2018.01-xilinx-v2018.1+gitAUTOINC+949e5cb9a7-r0.elf
u-boot-zcu102-zynqmp.bin
u-boot-zcu102-zynqmp.elf
u-boot.bin
u-boot.elf
uEnv.txt
zynqmp-zcu102-rev1.0.dtb

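August 30, 2018

Building the Duplicate IP, Part 2 (Vivado HLS Project)

Continued from "Building the Duplicate IP, Part 1 (Source Code)".

Last time I showed the source code of the Duplicate IP, which turns one HLS stream into two. This time I use that source code to create a Vivado HLS 2018.2 project and take it all the way to an exported IP.

Here is the duplicate project in Vivado HLS 2018.2.

I ran C simulation. The result is shown below.

The result was "No Error": the two HLS streams were identical.

I ran C synthesis.

The Estimated clock period is 5.403 ns, which is sufficient.
The latency is 790 clocks for 784 pixels (28 x 28), so the throughput is almost 1 pixel per clock.
Resource usage was 106 FFs and 464 LUTs.
Next, let's click grp_split_template_fu_186 under Detail -> Instance and look at the results for split_template.

Its latency was 789 clocks.
Resource usage was 103 FFs and 341 LUTs.

I ran C/RTL co-simulation.

The latency was 795 clocks.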
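
The "almost 1 pixel per clock" reading of these latency figures can be checked with a few lines of arithmetic. The following is only a small sketch: the clock counts are the ones reported in this post, while the helper name overhead_and_rate and the dictionary labels are made up for illustration.

```python
# Throughput check for the Duplicate IP, using the figures reported in this post.
PIXELS = 28 * 28  # HORIZONTAL_PIXEL_WIDTH * VERTICAL_PIXEL_WIDTH = 784

# Latencies in clock cycles as reported by Vivado HLS (labels are illustrative).
latencies = {
    "C synthesis (top function)": 790,
    "C synthesis (template instance)": 789,
    "C/RTL co-simulation": 795,
}


def overhead_and_rate(latency_clocks, pixels=PIXELS):
    """Return (clocks beyond one per pixel, average clocks per pixel)."""
    return latency_clocks - pixels, latency_clocks / pixels


for name, clocks in latencies.items():
    extra, clocks_per_pixel = overhead_and_rate(clocks)
    print(f"{name}: {clocks} clocks, {extra} extra, {clocks_per_pixel:.4f} clocks/pixel")
```

In every case the overhead beyond one clock per pixel is only a handful of cycles, which is consistent with the inner loop being pipelined at II=1.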

The C/RTL co-simulation waveform is shown below.

The control signals are almost straight lines, which looks good.

I ran Export RTL, with "Vivado synthesis, place and route" checked.

"CP achieved post-implementation" is 4.348 ns, which is also fine.

August 29, 2018

Building the Duplicate IP, Part 1 (Source Code)

To implement SqueezeNet4mnist with Vivado HLS templates, I need a Duplicate IP that splits one HLS stream into two HLS streams, and a Concatenate IP that merges two HLS streams into one.
This time, let's build the Duplicate IP.

First, here is duplicate_template.h.

// duplicate_template.h
// 2018/08/23 by marsee
//

#ifndef __DUPLICATE_TEMPLATE_H___
#define __DUPLICATE_TEMPLATE_H___

#include <ap_int.h>
#include <hls_stream.h>
#include <ap_axi_sdata.h>
#include <hls_video.h>
#include <ap_fixed.h>

#include "layer_general.h"

#define TO_LITERAL(x) #x
#define PRAGMA_HLS(tok) _Pragma(TO_LITERAL(HLS tok)) // courtesy of @hiyuh

template<
    const size_t IN_W,     // input bit width; both outputs use the same bit width and binary point position
    const size_t IN_I,     // binary point position (integer bits) of the input
    const size_t NUMBER_OF_IN_CHANNELS,
    const size_t HORIZONTAL_PIXEL_WIDTH,
    const size_t VERTICAL_PIXEL_WIDTH
>int duplicate_template(hls::stream<ap_fixed_axis<IN_W,IN_I,NUMBER_OF_IN_CHANNELS,1> >&ins,
    hls::stream<ap_fixed_axis<IN_W,IN_I,NUMBER_OF_IN_CHANNELS,1> >&outs0,
    hls::stream<ap_fixed_axis<IN_W,IN_I,NUMBER_OF_IN_CHANNELS,1> >&outs1
){
    ap_fixed_axis<IN_W,IN_I,NUMBER_OF_IN_CHANNELS,1> in;
    ap_fixed_axis<IN_W,IN_I,NUMBER_OF_IN_CHANNELS,1> out;

    Loop1 : do { // start when user becomes 1
#pragma HLS PIPELINE II=1
#pragma HLS LOOP_TRIPCOUNT min=1 max=1 avg=1
        ins >> in;
    } while(in.user == 0);

    Loop_y : for(int y=0; y<VERTICAL_PIXEL_WIDTH; y++){
        Loop_x : for(int x=0; x<HORIZONTAL_PIXEL_WIDTH; x++){
#pragma HLS PIPELINE II=1
            if(!(x==0 && y==0)){
                ins >> in;
            }

            Loop_cp : for(int i=0; i<NUMBER_OF_IN_CHANNELS; i++){
                if(i<NUMBER_OF_IN_CHANNELS){
                    out.data[i] = in.data[i];
                }
            }

            out.user = in.user;
            out.last = in.last;

            outs0 << out;
            outs1 << out;
        }
    }
    return(0);
}

#endif

The code simply copies one HLS stream into two HLS streams.
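
For readers who want to follow the control flow without an HLS toolchain, here is a small software-only sketch of the same behaviour in Python. It is not the HLS code itself: Beat, duplicate and make_stimulus are hypothetical names introduced for illustration, and a deque stands in for hls::stream. It mirrors the logic of the template: discard input until user is 1, then copy exactly width x height transfers, including the user and last flags, to both outputs.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Beat:
    """One stream transfer: per-channel data plus the user/last sideband bits."""
    data: Tuple[float, ...]
    user: int = 0
    last: int = 0


def duplicate(ins, width, height):
    """Behavioural model: copy one frame from ins into two identical outputs."""
    outs0: List[Beat] = []
    outs1: List[Beat] = []
    # Like Loop1: keep reading until the start-of-frame marker (user == 1).
    beat = ins.popleft()
    while beat.user == 0:
        beat = ins.popleft()
    # Like Loop_y/Loop_x: the first pixel is already in hand, read the rest.
    for n in range(width * height):
        if n != 0:
            beat = ins.popleft()
        outs0.append(Beat(beat.data, beat.user, beat.last))
        outs1.append(Beat(beat.data, beat.user, beat.last))
    return outs0, outs1


def make_stimulus(width, height, dummies=5):
    """Dummy beats with user == 0, then one frame with user/last set as in the testbench."""
    stream = deque(Beat((0.0, 0.0)) for _ in range(dummies))
    for y in range(height):
        for x in range(width):
            value = float(y * width + x)
            stream.append(Beat((value, value + 1.0),
                               user=int(x == 0 and y == 0),
                               last=int(x == width - 1)))
    return stream


outs0, outs1 = duplicate(make_stimulus(28, 28), 28, 28)
print(len(outs0), outs0 == outs1)
```

The model makes one property of the design easy to see: the dummy transfers before the start-of-frame marker are consumed but never forwarded, so both outputs start exactly at the pixel whose user flag is 1.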

Next, here is duplicate1.cpp, which instantiates duplicate_template.h.

// duplicate1.cpp
// 2018/08/23 by marsee
//

#include "duplicate_template.h"

int duplicate1(hls::stream<ap_fixed_axis<16,6,2,1> >& ins,
        hls::stream<ap_fixed_axis<16,6,2,1> >& outs0,
        hls::stream<ap_fixed_axis<16,6,2,1> >& outs1){
    return(duplicate_template<16,6,2,28,28>(ins, outs0, outs1));
}

Finally, here is the testbench, duplicate1_tb.cpp.

// duplicate1_tb.cpp
// 2018/08/23 by marsee
//

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <ap_int.h>
#include <hls_stream.h>
#include <iostream>
#include <fstream>
#include <iomanip>
#include <math.h>
#include <ap_axi_sdata.h>
#include <hls_video.h>

#include "layer_general.h"

static const size_t NW = 16;
static const size_t NI = 6;
static const size_t NUMBER_OF_KERNEL = 2;
static const size_t HORIZONTAL_PIXEL_WIDTH = 28;
static const size_t VERTICAL_PIXEL_WIDTH = 28;

int duplicate1(hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> >& ins,
        hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> >& outs0,
        hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> >& outs1);

int main(){
    using namespace std;

    hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> > ins;
    hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> > outs0;
    hls::stream<ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> > outs1;
    ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> in;
    ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> out0;
    ap_fixed_axis<NW,NI,NUMBER_OF_KERNEL,1> out1;

    typedef ap_fixed<NW,NI,AP_TRN,AP_WRAP> ap_fixed_type;

    // prepare the input data in ins
    for(int i=0; i<5; i++){    // dummy data
        in.user = 0;
        for(int k=0; k<NUMBER_OF_KERNEL; k++){
            in.data[k] = ap_fixed_type(float(k+i)/100.0);
        }
        ins << in;
    }

    // feed one frame of data into ins
    for(int j=0; j < VERTICAL_PIXEL_WIDTH; j++){
        for(int i=0; i < HORIZONTAL_PIXEL_WIDTH; i++){
            for(int k=0; k<NUMBER_OF_KERNEL; k++){
                in.data[k] = ap_fixed_type((float)(k+HORIZONTAL_PIXEL_WIDTH*j+i)/100.0);
            }

            if (j==0 && i==0){    // set TUSER to 1 on the first data
                in.user = 1;
            } else {
                in.user = 0;
            }

            if (i == HORIZONTAL_PIXEL_WIDTH-1){ // assert TLAST at the end of each row
                in.last = 1;
            } else {
                in.last = 0;
            }
            ins << in;
        }
    }

    duplicate1(ins, outs0, outs1);

    // compare outs0 and outs1
    int error = 0;
    for(int j=0; j < VERTICAL_PIXEL_WIDTH; j++){
        for(int i=0; i < HORIZONTAL_PIXEL_WIDTH; i++){
            outs0 >> out0;
            outs1 >> out1;
            for(int k=0; k<NUMBER_OF_KERNEL; k++){
                if(out0.data[k] != out1.data[k]){
                    printf("Error : out0.data[%d] = %f, out1.data[%d] = %f\n", k, float(out0.data[k]), k, float(out1.data[k]));
                    error = 1;
                }
            }

            if(j==0 && i==0){
                if(out0.user==1 && out1.user==1) ;
                else{
                    printf("j==0 && i==0 : out0.user = %d, out1.user = %d\n", out0.user, out1.user);
                    error = 1;
                }
            } else {
                if(out0.user==0 && out1.user==0) ;
                else{
                    printf("j!=0 || i!=0 : out0.user = %d, out1.user = %d\n", out0.user, out1.user);
                    error = 1;
                }
            }

            if(i == HORIZONTAL_PIXEL_WIDTH-1){
                if(out0.last==1 && out1.last==1) ;
                else{
                    printf("Horizontal last data : out0.last = %d, out1.last = %d\n", out0.last, out1.last);
                    error = 1;
                }
            }
        }
    }
    if(error == 0)
        printf("No Error\n");

    return(0);
}

The testbench prepares the HLS stream data, calls duplicate1, and compares the two output HLS streams, outs0 and outs1, to check that they are identical.

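August 28, 2018

SqueezeNet for MNIST 3 (Layer Statistics and Output to C Header Files)

Continued from "SqueezeNet for MNIST 2".

Last time I showed the model accuracy and model loss, along with model.summary(). This time I collect statistics for each layer and write each layer's weights and biases out to C header files.

To write the convolution layer weights to a C header file, I used the Python code from "Trying Out TensorFlow + Keras 9 (Converting Convolution Layer Weights to a C Header)".
To write the convolution layer biases to a C header file, I used the Python code from "Trying Out TensorFlow + Keras 10 (Converting Biases to a C Header)".

I then printed the statistics of each layer and wrote the weights and biases out to C header files.

Here is the Python code.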

# Extract the intermediate output of each convolution layer
from keras.models import Model
import numpy as np

for num in range(1, 27):
    conv_layer_name = 'conv2d_' + str(num)

    conv_layer = model.get_layer(conv_layer_name)
    conv_layer_wb = conv_layer.get_weights()

    conv_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(conv_layer_name).output)
    conv_output = conv_layer_model.predict(x_test, verbose=1)

    conv_layer_weight = conv_layer_wb[0]
    conv_layer_bias = conv_layer_wb[1]

    print(conv_layer_name)
    print(conv_layer_weight.shape)
    print(conv_layer_weight.transpose(3,2,0,1).shape)
    print(conv_layer_bias.shape)
    print(conv_output.shape)

    print("np.max(conv_layer_weight) = {0}".format(np.max(conv_layer_weight)))
    print("np.min(conv_layer_weight) = {0}".format(np.min(conv_layer_weight)))
    abs_conv_layer_weight = np.absolute(conv_layer_weight)
    print("np.max(abs_conv_layer_weight) = {0}".format(np.max(abs_conv_layer_weight)))
    print("np.min(abs_conv_layer_weight) = {0}".format(np.min(abs_conv_layer_weight)))

    print("np.max(conv_layer_bias) = {0}".format(np.max(conv_layer_bias)))
    print("np.min(conv_layer_bias) = {0}".format(np.min(conv_layer_bias)))
    abs_conv_layer_bias = np.absolute(conv_layer_bias)
    print("np.max(abs_conv_layer_bias) = {0}".format(np.max(abs_conv_layer_bias)))
    print("np.min(abs_conv_layer_bias) = {0}".format(np.min(abs_conv_layer_bias)))

    print("conv_output = {0}".format(conv_output.shape))
    print("np.std(conv_output) = {0}".format(np.std(conv_output)))
    print("np.max(conv_output) = {0}".format(np.max(conv_output)))
    print("np.min(conv_output) = {0}".format(np.min(conv_output)))

    abs_conv_output = np.absolute(conv_output)
    print("np.max(abs_conv) = {0}".format(np.max(abs_conv_output)))
    print("np.min(abs_conv) = {0}".format(np.min(abs_conv_output)))
    print("")

    # Fixed 2018/06/05: the convolution layer weight array is laid out as (kernel height, kernel width, input channels, output channels), so I corrected the Python code. Thanks to @NORA__0013.

    MAGNIFICATION_CONV = 2 ** (9-1)
    fwrite_conv_weight(conv_layer_weight.transpose(3,2,0,1), 'conv'+str(num)+'_weight.h', 'conv'+str(num)+'_fweight', 'conv'+str(num)+'_weight', MAGNIFICATION_CONV)

    fwrite_bias(conv_layer_bias, 'conv'+str(num)+'_bias.h', 'conv'+str(num)+'_fbias', 'conv'+str(num)+'_bias', MAGNIFICATION_CONV)
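
The helpers fwrite_conv_weight and fwrite_bias come from the earlier posts and are not reproduced here, so their exact behaviour is not shown in this post. As a rough illustration of what a magnification of 2 ** (9-1) implies, the sketch below assumes a 9-bit signed fixed-point format with 8 fractional bits, and assumes that a value is scaled, rounded to the nearest integer and saturated. The function names quantize and dequantize and the rounding/saturation policy are assumptions for illustration, not the author's implementation.

```python
def quantize(value, total_bits=9, int_bits=1):
    """Scale a float to a signed fixed-point integer (assumed: round, then saturate)."""
    frac_bits = total_bits - int_bits
    magnification = 2 ** frac_bits       # 2 ** (9-1) = 256 with the defaults
    lowest = -(2 ** (total_bits - 1))    # most negative representable code
    highest = 2 ** (total_bits - 1) - 1  # most positive representable code
    return max(lowest, min(highest, round(value * magnification)))


def dequantize(code, total_bits=9, int_bits=1):
    """Convert a fixed-point integer code back to a float."""
    return code / 2 ** (total_bits - int_bits)


# Extreme weights of conv2d_1 taken from the statistics in this post.
for w in (0.23241539299488068, -0.2830057442188263):
    code = quantize(w)
    print(w, code, dequantize(code))
```

Under these assumptions any weight with an absolute value below 1 fits without saturation, and the round-trip error is at most half a step (1/512).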

The generated weight and bias files are shown below.

The full statistics are shown below.

10000/10000 [==============================] - 1s 137us/step
conv2d_1
(3, 3, 1, 96)
(96, 1, 3, 3)
(96,)
(10000, 13, 13, 96)
np.max(conv_layer_weight) = 0.23241539299488068
np.min(conv_layer_weight) = -0.2830057442188263
np.max(abs_conv_layer_weight) = 0.2830057442188263
np.min(abs_conv_layer_weight) = 0.00018287629063706845
np.max(conv_layer_bias) = 0.19594410061836243
np.min(conv_layer_bias) = -0.06709111481904984
np.max(abs_conv_layer_bias) = 0.19594410061836243
np.min(abs_conv_layer_bias) = 0.0003071741375606507
conv_output = (10000, 13, 13, 96)
np.std(conv_output) = 0.1008128821849823
np.max(conv_output) = 0.657059371471405
np.min(conv_output) = -0.7351041436195374
np.max(abs_conv) = 0.7351041436195374
np.min(abs_conv) = 1.3969838619232178e-09

10000/10000 [==============================] - 1s 84us/step
conv2d_2
(1, 1, 96, 16)
(16, 96, 1, 1)
(16,)
(10000, 6, 6, 16)
np.max(conv_layer_weight) = 0.41374555230140686
np.min(conv_layer_weight) = -0.4286271035671234
np.max(abs_conv_layer_weight) = 0.4286271035671234
np.min(abs_conv_layer_weight) = 0.00025720358826220036
np.max(conv_layer_bias) = 0.07645587623119354
np.min(conv_layer_bias) = -0.04433523118495941
np.max(abs_conv_layer_bias) = 0.07645587623119354
np.min(abs_conv_layer_bias) = 0.0009641082142479718
conv_output = (10000, 6, 6, 16)
np.std(conv_output) = 0.29928353428840637
np.max(conv_output) = 1.535873293876648
np.min(conv_output) = -1.057706356048584
np.max(abs_conv) = 1.535873293876648
np.min(abs_conv) = 1.825392246246338e-07

10000/10000 [==============================] - 1s 111us/step
conv2d_3
(1, 1, 16, 64)
(64, 16, 1, 1)
(64,)
(10000, 6, 6, 64)
np.max(conv_layer_weight) = 0.3813319504261017
np.min(conv_layer_weight) = -0.3624926209449768
np.max(abs_conv_layer_weight) = 0.3813319504261017
np.min(abs_conv_layer_weight) = 0.0003253854811191559
np.max(conv_layer_bias) = 0.08430161327123642
np.min(conv_layer_bias) = -0.06795598566532135
np.max(abs_conv_layer_bias) = 0.08430161327123642
np.min(abs_conv_layer_bias) = 0.0005309522384777665
conv_output = (10000, 6, 6, 64)
np.std(conv_output) = 0.19837331771850586
np.max(conv_output) = 1.1776185035705566
np.min(conv_output) = -0.9223350882530212
np.max(abs_conv) = 1.1776185035705566
np.min(abs_conv) = 7.450580596923828e-09

10000/10000 [==============================] - 1s 107us/step
conv2d_4
(3, 3, 16, 64)
(64, 16, 3, 3)
(64,)
(10000, 6, 6, 64)
np.max(conv_layer_weight) = 0.23302672803401947
np.min(conv_layer_weight) = -0.23958751559257507
np.max(abs_conv_layer_weight) = 0.23958751559257507
np.min(abs_conv_layer_weight) = 3.953342002205318e-06
np.max(conv_layer_bias) = 0.09260329604148865
np.min(conv_layer_bias) = -0.07423409074544907
np.max(abs_conv_layer_bias) = 0.09260329604148865
np.min(abs_conv_layer_bias) = 4.278140841051936e-05
conv_output = (10000, 6, 6, 64)
np.std(conv_output) = 0.3154037594795227
np.max(conv_output) = 1.8085815906524658
np.min(conv_output) = -2.099088191986084
np.max(abs_conv) = 2.099088191986084
np.min(abs_conv) = 2.2351741790771484e-08

10000/10000 [==============================] - 1s 109us/step
conv2d_5
(1, 1, 128, 16)
(16, 128, 1, 1)
(16,)
(10000, 6, 6, 16)
np.max(conv_layer_weight) = 0.3283624053001404
np.min(conv_layer_weight) = -0.32311302423477173
np.max(abs_conv_layer_weight) = 0.3283624053001404
np.min(abs_conv_layer_weight) = 9.099296585191041e-05
np.max(conv_layer_bias) = 0.11211202293634415
np.min(conv_layer_bias) = -0.023712079972028732
np.max(abs_conv_layer_bias) = 0.11211202293634415
np.min(abs_conv_layer_bias) = 0.00019536164472810924
conv_output = (10000, 6, 6, 16)
np.std(conv_output) = 0.39457967877388
np.max(conv_output) = 3.2098631858825684
np.min(conv_output) = -1.4322690963745117
np.max(abs_conv) = 3.2098631858825684
np.min(abs_conv) = 2.9802322387695312e-08

10000/10000 [==============================] - 1s 116us/step
conv2d_6
(1, 1, 16, 64)
(64, 16, 1, 1)
(64,)
(10000, 6, 6, 64)
np.max(conv_layer_weight) = 0.3472273051738739
np.min(conv_layer_weight) = -0.3461221754550934
np.max(abs_conv_layer_weight) = 0.3472273051738739
np.min(abs_conv_layer_weight) = 2.151904845959507e-05
np.max(conv_layer_bias) = 0.11182372272014618
np.min(conv_layer_bias) = -0.07308819890022278
np.max(abs_conv_layer_bias) = 0.11182372272014618
np.min(abs_conv_layer_bias) = 9.089573723031208e-05
conv_output = (10000, 6, 6, 64)
np.std(conv_output) = 0.28788891434669495
np.max(conv_output) = 2.15021014213562
np.min(conv_output) = -1.765142560005188
np.max(abs_conv) = 2.15021014213562
np.min(abs_conv) = 1.4901161193847656e-08

10000/10000 [==============================] - 1s 134us/step
conv2d_7
(3, 3, 16, 64)
(64, 16, 3, 3)
(64,)
(10000, 6, 6, 64)
np.max(conv_layer_weight) = 0.25860607624053955
np.min(conv_layer_weight) = -0.22963279485702515
np.max(abs_conv_layer_weight) = 0.25860607624053955
np.min(abs_conv_layer_weight) = 5.8508041547611356e-05
np.max(conv_layer_bias) = 0.08639949560165405
np.min(conv_layer_bias) = -0.08458155393600464
np.max(abs_conv_layer_bias) = 0.08639949560165405
np.min(abs_conv_layer_bias) = 8.820320363156497e-05
conv_output = (10000, 6, 6, 64)
np.std(conv_output) = 0.4426092207431793
np.max(conv_output) = 2.844907522201538
np.min(conv_output) = -2.9819791316986084
np.max(abs_conv) = 2.9819791316986084
np.min(abs_conv) = 1.1175870895385742e-08

10000/10000 [==============================] - 1s 142us/step
conv2d_8
(1, 1, 128, 32)
(32, 128, 1, 1)
(32,)
(10000, 6, 6, 32)
np.max(conv_layer_weight) = 0.3010956645011902
np.min(conv_layer_weight) = -0.31097307801246643
np.max(abs_conv_layer_weight) = 0.31097307801246643
np.min(abs_conv_layer_weight) = 4.14710957556963e-05
np.max(conv_layer_bias) = 0.11694670468568802
np.min(conv_layer_bias) = -0.049082059413194656
np.max(abs_conv_layer_bias) = 0.11694670468568802
np.min(abs_conv_layer_bias) = 0.00034066737862303853
conv_output = (10000, 6, 6, 32)
np.std(conv_output) = 0.4840845465660095
np.max(conv_output) = 2.9063262939453125
np.min(conv_output) = -3.1062679290771484
np.max(abs_conv) = 3.1062679290771484
np.min(abs_conv) = 7.182825356721878e-08

10000/10000 [==============================] - 2s 157us/step
conv2d_9
(1, 1, 32, 128)
(128, 32, 1, 1)
(128,)
(10000, 6, 6, 128)
np.max(conv_layer_weight) = 0.2580183744430542
np.min(conv_layer_weight) = -0.2867690324783325
np.max(abs_conv_layer_weight) = 0.2867690324783325
np.min(abs_conv_layer_weight) = 5.197449354454875e-06
np.max(conv_layer_bias) = 0.08810567855834961
np.min(conv_layer_bias) = -0.10537270456552505
np.max(abs_conv_layer_bias) = 0.10537270456552505
np.min(abs_conv_layer_bias) = 0.0009785539004951715
conv_output = (10000, 6, 6, 128)
np.std(conv_output) = 0.21309173107147217
np.max(conv_output) = 1.4621883630752563
np.min(conv_output) = -1.9276840686798096
np.max(abs_conv) = 1.9276840686798096
np.min(abs_conv) = 1.862645149230957e-09

10000/10000 [==============================] - 2s 198us/step
conv2d_10
(3, 3, 32, 128)
(128, 32, 3, 3)
(128,)
(10000, 6, 6, 128)
np.max(conv_layer_weight) = 0.24369853734970093
np.min(conv_layer_weight) = -0.23873521387577057
np.max(abs_conv_layer_weight) = 0.24369853734970093
np.min(abs_conv_layer_weight) = 2.737535396590829e-06
np.max(conv_layer_bias) = 0.06724698841571808
np.min(conv_layer_bias) = -0.08158554136753082
np.max(abs_conv_layer_bias) = 0.08158554136753082
np.min(abs_conv_layer_bias) = 0.00012395322846714407
conv_output = (10000, 6, 6, 128)
np.std(conv_output) = 0.4583068788051605
np.max(conv_output) = 3.0841262340545654
np.min(conv_output) = -3.095609426498413
np.max(abs_conv) = 3.095609426498413
np.min(abs_conv) = 5.587935447692871e-09

10000/10000 [==============================] - 2s 208us/step
conv2d_11
(1, 1, 256, 32)
(32, 256, 1, 1)
(32,)
(10000, 3, 3, 32)
np.max(conv_layer_weight) = 0.2613295018672943
np.min(conv_layer_weight) = -0.2657088041305542
np.max(abs_conv_layer_weight) = 0.2657088041305542
np.min(abs_conv_layer_weight) = 6.479893636424094e-05
np.max(conv_layer_bias) = 0.08481862396001816
np.min(conv_layer_bias) = -0.04361288622021675
np.max(abs_conv_layer_bias) = 0.08481862396001816
np.min(abs_conv_layer_bias) = 0.0018431995995342731
conv_output = (10000, 3, 3, 32)
np.std(conv_output) = 0.6534223556518555
np.max(conv_output) = 4.2715630531311035
np.min(conv_output) = -3.003734827041626
np.max(abs_conv) = 4.2715630531311035
np.min(abs_conv) = 6.174668669700623e-07

10000/10000 [==============================] - 2s 233us/step
conv2d_12
(1, 1, 32, 128)
(128, 32, 1, 1)
(128,)
(10000, 3, 3, 128)
np.max(conv_layer_weight) = 0.24686792492866516
np.min(conv_layer_weight) = -0.245181143283844
np.max(abs_conv_layer_weight) = 0.24686792492866516
np.min(abs_conv_layer_weight) = 0.00010391152318334207
np.max(conv_layer_bias) = 0.1275489628314972
np.min(conv_layer_bias) = -0.09533561766147614
np.max(abs_conv_layer_bias) = 0.1275489628314972
np.min(abs_conv_layer_bias) = 0.0002949666231870651
conv_output = (10000, 3, 3, 128)
np.std(conv_output) = 0.30596160888671875
np.max(conv_output) = 2.30812931060791
np.min(conv_output) = -1.7490240335464478
np.max(abs_conv) = 2.30812931060791
np.min(abs_conv) = 6.332993507385254e-08

10000/10000 [==============================] - 2s 231us/step
conv2d_13
(3, 3, 32, 128)
(128, 32, 3, 3)
(128,)
(10000, 3, 3, 128)
np.max(conv_layer_weight) = 0.16889545321464539
np.min(conv_layer_weight) = -0.1818806380033493
np.max(abs_conv_layer_weight) = 0.1818806380033493
np.min(abs_conv_layer_weight) = 5.960464477539063e-08
np.max(conv_layer_bias) = 0.10003684461116791
np.min(conv_layer_bias) = -0.07217587530612946
np.max(abs_conv_layer_bias) = 0.10003684461116791
np.min(abs_conv_layer_bias) = 6.108602974563837e-06
conv_output = (10000, 3, 3, 128)
np.std(conv_output) = 0.38053178787231445
np.max(conv_output) = 2.981738805770874
np.min(conv_output) = -2.0631744861602783
np.max(abs_conv) = 2.981738805770874
np.min(abs_conv) = 4.470348358154297e-08

10000/10000 [==============================] - 2s 229us/step
conv2d_14
(1, 1, 256, 48)
(48, 256, 1, 1)
(48,)
(10000, 3, 3, 48)
np.max(conv_layer_weight) = 0.22013017535209656
np.min(conv_layer_weight) = -0.2426384836435318
np.max(abs_conv_layer_weight) = 0.2426384836435318
np.min(abs_conv_layer_weight) = 5.409237928688526e-06
np.max(conv_layer_bias) = 0.127075657248497
np.min(conv_layer_bias) = -0.07448185980319977
np.max(abs_conv_layer_bias) = 0.127075657248497
np.min(abs_conv_layer_bias) = 6.410764763131738e-05
conv_output = (10000, 3, 3, 48)
np.std(conv_output) = 0.4870619773864746
np.max(conv_output) = 4.841851711273193
np.min(conv_output) = -3.36439847946167
np.max(abs_conv) = 4.841851711273193
np.min(abs_conv) = 5.21540641784668e-07

10000/10000 [==============================] - 2s 246us/step
conv2d_15
(1, 1, 48, 192)
(192, 48, 1, 1)
(192,)
(10000, 3, 3, 192)
np.max(conv_layer_weight) = 0.2446354627609253
np.min(conv_layer_weight) = -0.212723970413208
np.max(abs_conv_layer_weight) = 0.2446354627609253
np.min(abs_conv_layer_weight) = 1.650952617637813e-05
np.max(conv_layer_bias) = 0.10810786485671997
np.min(conv_layer_bias) = -0.11747808754444122
np.max(abs_conv_layer_bias) = 0.11747808754444122
np.min(abs_conv_layer_bias) = 0.0001598717353772372
conv_output = (10000, 3, 3, 192)
np.std(conv_output) = 0.2135731279850006
np.max(conv_output) = 2.3861021995544434
np.min(conv_output) = -1.382702112197876
np.max(abs_conv) = 2.3861021995544434
np.min(abs_conv) = 7.450580596923828e-09

10000/10000 [==============================] - 3s 274us/step
conv2d_16
(3, 3, 48, 192)
(192, 48, 3, 3)
(192,)
(10000, 3, 3, 192)
np.max(conv_layer_weight) = 0.1449326127767563
np.min(conv_layer_weight) = -0.13954271376132965
np.max(abs_conv_layer_weight) = 0.1449326127767563
np.min(abs_conv_layer_weight) = 4.5239175960887223e-08
np.max(conv_layer_bias) = 0.0911775752902031
np.min(conv_layer_bias) = -0.07379891723394394
np.max(abs_conv_layer_bias) = 0.0911775752902031
np.min(abs_conv_layer_bias) = 7.930440187919885e-05
conv_output = (10000, 3, 3, 192)
np.std(conv_output) = 0.26481249928474426
np.max(conv_output) = 2.032439708709717
np.min(conv_output) = -2.2265918254852295
np.max(abs_conv) = 2.2265918254852295
np.min(abs_conv) = 7.450580596923828e-09

10000/10000 [==============================] - 3s 278us/step
conv2d_17
(1, 1, 384, 48)
(48, 384, 1, 1)
(48,)
(10000, 3, 3, 48)
np.max(conv_layer_weight) = 0.18762683868408203
np.min(conv_layer_weight) = -0.17972998321056366
np.max(abs_conv_layer_weight) = 0.18762683868408203
np.min(abs_conv_layer_weight) = 2.5248411475331523e-05
np.max(conv_layer_bias) = 0.1976175457239151
np.min(conv_layer_bias) = -0.056839194148778915
np.max(abs_conv_layer_bias) = 0.1976175457239151
np.min(abs_conv_layer_bias) = 8.029816672205925e-05
conv_output = (10000, 3, 3, 48)
np.std(conv_output) = 0.4208807051181793
np.max(conv_output) = 4.762706756591797
np.min(conv_output) = -2.9888551235198975
np.max(abs_conv) = 4.762706756591797
np.min(abs_conv) = 7.450580596923828e-09

10000/10000 [==============================] - 3s 304us/step
conv2d_18
(1, 1, 48, 192)
(192, 48, 1, 1)
(192,)
(10000, 3, 3, 192)
np.max(conv_layer_weight) = 0.2371293008327484
np.min(conv_layer_weight) = -0.2002277970314026
np.max(abs_conv_layer_weight) = 0.2371293008327484
np.min(abs_conv_layer_weight) = 1.3661010598298162e-06
np.max(conv_layer_bias) = 0.11516989022493362
np.min(conv_layer_bias) = -0.11722267419099808
np.max(abs_conv_layer_bias) = 0.11722267419099808
np.min(abs_conv_layer_bias) = 0.0005882936529815197
conv_output = (10000, 3, 3, 192)
np.std(conv_output) = 0.18415993452072144
np.max(conv_output) = 1.7632944583892822
np.min(conv_output) = -1.28092360496521
np.max(abs_conv) = 1.7632944583892822
np.min(abs_conv) = 1.1175870895385742e-08

10000/10000 [==============================] - 3s 308us/step
conv2d_19
(3, 3, 48, 192)
(192, 48, 3, 3)
(192,)
(10000, 3, 3, 192)
np.max(conv_layer_weight) = 0.12958909571170807
np.min(conv_layer_weight) = -0.15907122194766998
np.max(abs_conv_layer_weight) = 0.15907122194766998
np.min(abs_conv_layer_weight) = 1.0321384280587154e-07
np.max(conv_layer_bias) = 0.11025165766477585
np.min(conv_layer_bias) = -0.10412546247243881
np.max(abs_conv_layer_bias) = 0.11025165766477585
np.min(abs_conv_layer_bias) = 0.0005223090411163867
conv_output = (10000, 3, 3, 192)
np.std(conv_output) = 0.2775069773197174
np.max(conv_output) = 2.5127899646759033
np.min(conv_output) = -2.445796489715576
np.max(abs_conv) = 2.5127899646759033
np.min(abs_conv) = 3.725290298461914e-08

10000/10000 [==============================] - 3s 323us/step
conv2d_20
(1, 1, 384, 64)
(64, 384, 1, 1)
(64,)
(10000, 3, 3, 64)
np.max(conv_layer_weight) = 0.17044414579868317
np.min(conv_layer_weight) = -0.1799185425043106
np.max(abs_conv_layer_weight) = 0.1799185425043106
np.min(abs_conv_layer_weight) = 1.3735225365962833e-06
np.max(conv_layer_bias) = 0.11707722395658493
np.min(conv_layer_bias) = -0.07737477868795395
np.max(abs_conv_layer_bias) = 0.11707722395658493
np.min(abs_conv_layer_bias) = 0.0006104373605921865
conv_output = (10000, 3, 3, 64)
np.std(conv_output) = 0.373676598072052
np.max(conv_output) = 3.9217896461486816
np.min(conv_output) = -2.750190496444702
np.max(abs_conv) = 3.9217896461486816
np.min(abs_conv) = 2.0116567611694336e-07

10000/10000 [==============================] - 3s 333us/step
conv2d_21
(1, 1, 64, 256)
(256, 64, 1, 1)
(256,)
(10000, 3, 3, 256)
np.max(conv_layer_weight) = 0.18047918379306793
np.min(conv_layer_weight) = -0.17668132483959198
np.max(abs_conv_layer_weight) = 0.18047918379306793
np.min(abs_conv_layer_weight) = 1.4669767551822588e-06
np.max(conv_layer_bias) = 0.04918604716658592
np.min(conv_layer_bias) = -0.10770349204540253
np.max(abs_conv_layer_bias) = 0.10770349204540253
np.min(abs_conv_layer_bias) = 0.00023276149295270443
conv_output = (10000, 3, 3, 256)
np.std(conv_output) = 0.1407841295003891
np.max(conv_output) = 1.3393429517745972
np.min(conv_output) = -1.2819019556045532
np.max(abs_conv) = 1.3393429517745972
np.min(abs_conv) = 1.862645149230957e-09

10000/10000 [==============================] - 4s 381us/step
conv2d_22
(3, 3, 64, 256)
(256, 64, 3, 3)
(256,)
(10000, 3, 3, 256)
np.max(conv_layer_weight) = 0.16038639843463898
np.min(conv_layer_weight) = -0.15026776492595673
np.max(abs_conv_layer_weight) = 0.16038639843463898
np.min(abs_conv_layer_weight) = 1.719431566016283e-07
np.max(conv_layer_bias) = 0.05694166570901871
np.min(conv_layer_bias) = -0.07688959687948227
np.max(abs_conv_layer_bias) = 0.07688959687948227
np.min(abs_conv_layer_bias) = 0.0001346934586763382
conv_output = (10000, 3, 3, 256)
np.std(conv_output) = 0.31745824217796326
np.max(conv_output) = 3.5984885692596436
np.min(conv_output) = -2.2447152137756348
np.max(abs_conv) = 3.5984885692596436
np.min(abs_conv) = 7.334165275096893e-08

10000/10000 [==============================] - 4s 384us/step
conv2d_23
(1, 1, 512, 64)
(64, 512, 1, 1)
(64,)
(10000, 1, 1, 64)
np.max(conv_layer_weight) = 0.19756470620632172
np.min(conv_layer_weight) = -0.19377626478672028
np.max(abs_conv_layer_weight) = 0.19756470620632172
np.min(abs_conv_layer_weight) = 1.454010543966433e-06
np.max(conv_layer_bias) = 0.08133905380964279
np.min(conv_layer_bias) = -0.0322188101708889
np.max(abs_conv_layer_bias) = 0.08133905380964279
np.min(abs_conv_layer_bias) = 0.001015845569781959
conv_output = (10000, 1, 1, 64)
np.std(conv_output) = 0.9502997398376465
np.max(conv_output) = 6.56773042678833
np.min(conv_output) = -3.0976979732513428
np.max(abs_conv) = 6.56773042678833
np.min(abs_conv) = 6.351619958877563e-07

10000/10000 [==============================] - 4s 392us/step
conv2d_24
(1, 1, 64, 256)
(256, 64, 1, 1)
(256,)
(10000, 1, 1, 256)
np.max(conv_layer_weight) = 0.2581574618816376
np.min(conv_layer_weight) = -0.21347056329250336
np.max(abs_conv_layer_weight) = 0.2581574618816376
np.min(abs_conv_layer_weight) = 7.390540758933639e-06
np.max(conv_layer_bias) = 0.09020012617111206
np.min(conv_layer_bias) = -0.06578623503446579
np.max(abs_conv_layer_bias) = 0.09020012617111206
np.min(abs_conv_layer_bias) = 7.086795812938362e-05
conv_output = (10000, 1, 1, 256)
np.std(conv_output) = 0.649179220199585
np.max(conv_output) = 3.7935266494750977
np.min(conv_output) = -3.0759589672088623
np.max(abs_conv) = 3.7935266494750977
np.min(abs_conv) = 6.146728992462158e-08

10000/10000 [==============================] - 4s 406us/step
conv2d_25
(3, 3, 64, 256)
(256, 64, 3, 3)
(256,)
(10000, 1, 1, 256)
np.max(conv_layer_weight) = 0.17148838937282562
np.min(conv_layer_weight) = -0.13356263935565948
np.max(abs_conv_layer_weight) = 0.17148838937282562
np.min(abs_conv_layer_weight) = 1.862645149230957e-07
np.max(conv_layer_bias) = 0.11338628083467484
np.min(conv_layer_bias) = -0.05885875225067139
np.max(abs_conv_layer_bias) = 0.11338628083467484
np.min(abs_conv_layer_bias) = 2.0896652131341398e-05
conv_output = (10000, 1, 1, 256)
np.std(conv_output) = 0.4251912832260132
np.max(conv_output) = 2.2752349376678467
np.min(conv_output) = -1.4251927137374878
np.max(abs_conv) = 2.2752349376678467
np.min(abs_conv) = 2.4959444999694824e-07

10000/10000 [==============================] - 4s 416us/step
conv2d_26
(1, 1, 512, 10)
(10, 512, 1, 1)
(10,)
(10000, 1, 1, 10)
np.max(conv_layer_weight) = 0.2454858422279358
np.min(conv_layer_weight) = -0.24298278987407684
np.max(abs_conv_layer_weight) = 0.2454858422279358
np.min(abs_conv_layer_weight) = 5.791342118754983e-06
np.max(conv_layer_bias) = 0.05678941309452057
np.min(conv_layer_bias) = -0.04499662294983864
np.max(abs_conv_layer_bias) = 0.05678941309452057
np.min(abs_conv_layer_bias) = 0.001525017200037837
conv_output = (10000, 1, 1, 10)
np.std(conv_output) = 4.045248031616211
np.max(conv_output) = 13.264505386352539
np.min(conv_output) = -17.079362869262695
np.max(abs_conv) = 17.079362869262695
np.min(abs_conv) = 6.483122706413269e-05

August 27, 2018

Developing an application for photographing the ZYBOt course

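I tested driving the ZYBOt between the white lines in ”白線追従用CNNを使用したZYBOtの白線追従走行3（走行テスト）” (white-line following with the ZYBOt using the white-line tracking CNN, part 3: driving test), but it could not follow the course very well. So I decided to take photos of the course and retrain the network. For that I need an application that saves the camera image as BMP files.
I already wrote cam_capture_bmp.cpp in ”ZYBO Z7-20上のUbuntu 14.04でカメラ画像をBMPファイルに変換する” (converting camera images to BMP files on Ubuntu 14.04 on the ZYBO Z7-20). The drawback of that application is that it creates a new BMP file on every capture, so you have to step the image viewer to the next image each time. When looking for a good spot to shoot, it would be nicer if the program kept writing to the same file name and the image viewer refreshed automatically, which gives something like a pseudo-streaming view, so I want to try that. However, writing continuously would wear out the Micro SD card, so a frame is written only when a command is entered.
I will update the cam_capture_bmp.cpp from ”ZYBO Z7-20上のUbuntu 14.04でカメラ画像をBMPファイルに変換する”.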

Here is the updated cam_capture_bmp.cpp.

//
// cam_capture_bmp.cpp
// 2016/08/19 by marsee
//
// This software converts the left and right of the camera image to BMP file.
// -b : bmp file name
// -n : Start File Number
// -h : help
//
// 2018/08/26 : Added 'e' command
//

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>

#include "bmp_header.h"

#define PIXEL_NUM_OF_BYTES    4

#define SVGA_HORIZONTAL_PIXELS  800
#define SVGA_VERTICAL_LINES     600
#define SVGA_ALL_DISP_ADDRESS   (SVGA_HORIZONTAL_PIXELS * SVGA_VERTICAL_LINES * PIXEL_NUM_OF_BYTES)
#define SVGA_3_PICTURES         (SVGA_ALL_DISP_ADDRESS * NUMBER_OF_WRITE_FRAMES)

int WriteBMPfile(FILE *fbmp, volatile unsigned int *frame_buffer, BMP24FORMAT **bmp_data);

void cam_i2c_init(volatile unsigned *mt9d111_axi_iic) {
    mt9d111_axi_iic[64] = 0x2; // reset tx fifo ,address is 0x100, i2c_control_reg
    mt9d111_axi_iic[64] = 0x1; // enable i2c
}

void cam_i2x_write_sync(void) {
    // unsigned c;

    // c = *cam_i2c_rx_fifo;
    // while ((c & 0x84) != 0x80)
        // c = *cam_i2c_rx_fifo; // No Bus Busy and TX_FIFO_Empty = 1
    usleep(1000);
}

void cam_i2c_write(volatile unsigned *mt9d111_axi_iic, unsigned int device_addr, unsigned int write_addr, unsigned int write_data){
    mt9d111_axi_iic[66] = 0x100 | (device_addr & 0xfe); // Slave IIC Write Address, address is 0x108, i2c_tx_fifo
    mt9d111_axi_iic[66] = write_addr;
    mt9d111_axi_iic[66] = (write_data >> 8) & 0xff;         // first data (upper byte; mask, not OR, so the byte is not forced to 0xff)
    mt9d111_axi_iic[66] = 0x200 | (write_data & 0xff);      // second data
    cam_i2x_write_sync();
}

int main(int argc, char *argv[]){
    int opt;
    int c, help_flag=0;
    char bmp_fn[256] = "bmp_file";
    char  attr[1024];
    unsigned long  phys_addr;
    int i, j;
    int file_no = -1;
    FILE *fbmp;
    BMP24FORMAT **bmp_data; // 24 bits Data of BMP files (SVGA_HORIZONTAL_PIXELS * SVGA_VERTICAL_LINES)

    int fd0, fd1, fd2, fd3, fd4, fd5, fd6, fd7, fd8, fd9, fd10;
    volatile unsigned *bmdc_axi_lites0, *bmdc_axi_lites1;
    volatile unsigned *dmaw4gabor_0;
    volatile unsigned *axis_switch_0, *axis_switch_1;
    volatile unsigned *mt9d111_inf_axis_0;
    volatile unsigned *mt9d111_axi_iic;
    volatile unsigned *axi_gpio_0;
    volatile unsigned *frame_buffer_bmdc;

    while ((opt=getopt(argc, argv, "b:n:h")) != -1){
        switch (opt){
            case 'b':
                snprintf(bmp_fn, sizeof(bmp_fn), "%s", optarg); // bounded copy, truncates overly long names
                break;
            case 'n':
                file_no = atoi(optarg);
                break;
            case 'h':
                help_flag = 1;
                break;
        }
    }

    if (help_flag == 1){ // help
        printf("Usage : cam_capture_bmp [-b <bmp file name>] [-n <Start File Number>] [-h]\n");
        exit(0);
    }

    // Bitmap Display Controller 0 AXI4 Lite Slave (UIO6)
    fd6 = open("/dev/uio6", O_RDWR); // bitmap_display_controller 0 axi4 lite
    if (fd6 < 0){ // open() returns -1 on failure
        fprintf(stderr, "/dev/uio6 (bitmap_disp_cntrler_axi_master_0) open error\n");
        exit(-1);
    }
    bmdc_axi_lites0 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd6, 0);
    if ((void *)bmdc_axi_lites0 == MAP_FAILED){ // mmap() returns MAP_FAILED, not NULL, on failure
        fprintf(stderr, "bmdc_axi_lites0 mmap error\n");
        exit(-1);
    }

    // Bitmap Display Controller 1 AXI4 Lite Slave (UIO7)
    fd7 = open("/dev/uio7", O_RDWR); // bitmap_display_controller axi4 lite
    if (fd7 < 0){
        fprintf(stderr, "/dev/uio7 (bitmap_disp_cntrler_axi_master_1) open error\n");
        exit(-1);
    }
    bmdc_axi_lites1 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd7, 0);
    if ((void *)bmdc_axi_lites1 == MAP_FAILED){
        fprintf(stderr, "bmdc_axi_lites1 mmap error\n");
        exit(-1);
    }

    // dmaw4gabor_0 (UIO1)
    fd1 = open("/dev/uio1", O_RDWR); // dmaw4gabor_0 interface AXI4 Lite Slave
    if (fd1 < 0){
        fprintf(stderr, "/dev/uio1 (dmaw4gabor_0) open error\n");
        exit(-1);
    }
    dmaw4gabor_0 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd1, 0);
    if ((void *)dmaw4gabor_0 == MAP_FAILED){
        fprintf(stderr, "dmaw4gabor_0 mmap error\n");
        exit(-1);
    }

    // mt9d111 i2c AXI4 Lite Slave (UIO0)
    fd0 = open("/dev/uio0", O_RDWR); // mt9d111 i2c AXI4 Lite Slave
    if (fd0 < 0){
        fprintf(stderr, "/dev/uio0 (mt9d111_axi_iic) open error\n");
        exit(-1);
    }
    mt9d111_axi_iic = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd0, 0);
    if ((void *)mt9d111_axi_iic == MAP_FAILED){
        fprintf(stderr, "mt9d111_axi_iic mmap error\n");
        exit(-1);
    }

    // mt9d111 inf axis AXI4 Lite Slave (UIO5)
    fd5 = open("/dev/uio5", O_RDWR); // mt9d111 inf axis AXI4 Lite Slave
    if (fd5 < 0){
        fprintf(stderr, "/dev/uio5 (mt9d111_inf_axis_0) open error\n");
        exit(-1);
    }
    mt9d111_inf_axis_0 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd5, 0);
    if ((void *)mt9d111_inf_axis_0 == MAP_FAILED){
        fprintf(stderr, "mt9d111_inf_axis_0 mmap error\n");
        exit(-1);
    }

    // axis_switch_0 (UIO2)
    fd2 = open("/dev/uio2", O_RDWR); // axis_switch_0 interface AXI4 Lite Slave
    if (fd2 < 0){
        fprintf(stderr, "/dev/uio2 (axis_switch_0) open error\n");
        exit(-1);
    }
    axis_switch_0 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd2, 0);
    if ((void *)axis_switch_0 == MAP_FAILED){
        fprintf(stderr, "axis_switch_0 mmap error\n");
        exit(-1);
    }

    // axis_switch_1 (UIO3)
    fd3 = open("/dev/uio3", O_RDWR); // axis_switch_1 interface AXI4 Lite Slave
    if (fd3 < 0){
        fprintf(stderr, "/dev/uio3 (axis_switch_1) open error\n");
        exit(-1);
    }
    axis_switch_1 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd3, 0);
    if ((void *)axis_switch_1 == MAP_FAILED){
        fprintf(stderr, "axis_switch_1 mmap error\n");
        exit(-1);
    }

    // axi_gpio_0 (UIO8)
    fd8 = open("/dev/uio8", O_RDWR); // axi_gpio_0 interface AXI4 Lite Slave
    if (fd8 < 0){
        fprintf(stderr, "/dev/uio8 (axi_gpio_0) open error\n");
        exit(-1);
    }
    axi_gpio_0 = (volatile unsigned *)mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd8, 0);
    if ((void *)axi_gpio_0 == MAP_FAILED){
        fprintf(stderr, "axi_gpio_0 mmap error\n");
        exit(-1);
    }

    // udmabuf0
    fd9 = open("/dev/udmabuf0", O_RDWR | O_SYNC); // frame_buffer, the cache is disabled.
    if (fd9 == -1){
        fprintf(stderr, "/dev/udmabuf0 open error\n");
        exit(-1);
    }
    frame_buffer_bmdc = (volatile unsigned *)mmap(NULL, 5760000, PROT_READ|PROT_WRITE, MAP_SHARED, fd9, 0);
    if ((void *)frame_buffer_bmdc == MAP_FAILED){
        fprintf(stderr, "frame_buffer_bmdc mmap error\n");
        exit(-1);
    }

    // axis_switch_1, 1to2 ,Select M00_AXIS
    // Refer to http://marsee101.blog19.fc2.com/blog-entry-3177.html
    axis_switch_1[16] = 0x0; // 0x40 = 0
    axis_switch_1[17] = 0x80000000; // 0x44 = 0x80000000, disable
    axis_switch_1[18] = 0x80000000; // 0x48 = 0x80000000, disable
    axis_switch_1[19] = 0x80000000; // 0x4C = 0x80000000, disable
    axis_switch_1[0] = 0x2; // Commit registers

    // axis_switch_0, 2to1, Select S00_AXIS
    // Refer to http://marsee101.blog19.fc2.com/blog-entry-3177.html
    axis_switch_0[16] = 0x0; // 0x40 = 0;
    axis_switch_0[0] = 0x2; // Commit registers

    // phys_addr of udmabuf0
    fd10 = open("/sys/class/udmabuf/udmabuf0/phys_addr", O_RDONLY);
    if (fd10 == -1){
        fprintf(stderr, "/sys/class/udmabuf/udmabuf0/phys_addr open error\n");
        exit(-1);
    }
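    ssize_t attr_len = read(fd10, attr, sizeof(attr) - 1);
    if (attr_len <= 0){
        fprintf(stderr, "/sys/class/udmabuf/udmabuf0/phys_addr read error\n");
        exit(-1);
    }
    attr[attr_len] = '\0'; // read() does not null-terminate the buffer
    sscanf(attr, "%lx", &phys_addr);
    close(fd10);
    printf("phys_addr = %lx\n", phys_addr);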

    // DMAW4Gabor Initialization sequence
    dmaw4gabor_0[6] = (unsigned int)phys_addr; // Data signal of frame_buffer0
    dmaw4gabor_0[8] = (unsigned int)phys_addr+SVGA_ALL_DISP_ADDRESS; // Data signal of frame_buffer1
    dmaw4gabor_0[0] = 0x1; // ap_start = 1
    dmaw4gabor_0[0] = 0x80; // auto_restart = 1

    // bitmap display controller settings
    bmdc_axi_lites0[0] = (unsigned int)phys_addr; // Bitmap Display Controller 0 start
    bmdc_axi_lites1[0] = (unsigned int)phys_addr; // Bitmap Display Controller 1 start
    mt9d111_inf_axis_0[0] = (unsigned int)phys_addr; // Camera Interface start (Address is dummy)

    // CMOS Camera initialize, MT9D111
    cam_i2c_init(mt9d111_axi_iic);

    cam_i2c_write(mt9d111_axi_iic, 0xba, 0xf0, 0x1);        // Changed register map to IFP page 1
    cam_i2c_write(mt9d111_axi_iic, 0xba, 0x97, 0x20);   // RGB Mode, RGB565

    mt9d111_inf_axis_0[1] = 0;

    // allocated the memory for bmp file
    if ((bmp_data=(BMP24FORMAT **)malloc(sizeof(BMP24FORMAT *)*SVGA_VERTICAL_LINES)) == NULL){
        fprintf(stderr, "Can not allocate memory of the first dimension of SVGA_VERTICAL_LINES of bmp_data\n");
        exit(1);
    }
    for (i=0; i<SVGA_VERTICAL_LINES; i++){
        if ((bmp_data[i]=(BMP24FORMAT *)malloc(sizeof(BMP24FORMAT) * SVGA_HORIZONTAL_PIXELS)) == NULL){
            fprintf(stderr, "Can not allocate %d th memory of the second dimension of bmp_data\n", i);
            exit(1);
        }
    }

    char bmp_file[256];

    // w - write a new BMP file.  e - overwrite the current BMP file.  q - exit.
    c = getc(stdin);
    while(c != 'q'){
        switch ((char)c) {
            case 'w' : // w - write a new BMP file.
                // write the frame buffer to the next file number
                file_no++;
                snprintf(bmp_file, sizeof(bmp_file), "%s%d.bmp", bmp_fn, file_no);
                if ((fbmp=fopen(bmp_file, "wb")) == NULL){
                    fprintf(stderr, "Cannot open %s in binary mode\n", bmp_file);
                    exit(1);
                }
                WriteBMPfile(fbmp, frame_buffer_bmdc, bmp_data);
                fclose(fbmp);

                printf("file No. = %d\n", file_no);

                break;
            case 'e' : // e - overwrite the same BMP file.
                // write the frame buffer to the current file number
                if (file_no == -1)
                    file_no = 0;

                snprintf(bmp_file, sizeof(bmp_file), "%s%d.bmp", bmp_fn, file_no);
                if ((fbmp=fopen(bmp_file, "wb")) == NULL){
                    fprintf(stderr, "Cannot open %s in binary mode\n", bmp_file);
                    exit(1);
                }
                WriteBMPfile(fbmp, frame_buffer_bmdc, bmp_data);
                fclose(fbmp);

                printf("file No. = %d\n", file_no);

                break;
        }
        c = getc(stdin);
    }

    for(i=0; i<SVGA_VERTICAL_LINES; i++){
        free(bmp_data[i]);
    }
    free(bmp_data);

    munmap((void *)bmdc_axi_lites0, 0x10000);
    munmap((void *)bmdc_axi_lites1, 0x10000);
    munmap((void *)dmaw4gabor_0, 0x10000);
    munmap((void *)mt9d111_inf_axis_0, 0x10000);
    munmap((void *)mt9d111_axi_iic, 0x10000);
    munmap((void *)axis_switch_0, 0x10000);
    munmap((void *)axis_switch_1, 0x10000);
    munmap((void *)axi_gpio_0, 0x10000);
    munmap((void *)frame_buffer_bmdc, 5760000); // same length as the mmap() call

    close(fd0);
    close(fd1);
    close(fd2);
    close(fd3);
    close(fd5);
    close(fd6);
    close(fd7);
    close(fd8);
    close(fd9);

    return(0);
}

int WriteBMPfile(FILE *fbmp, volatile unsigned *frame_buffer, BMP24FORMAT **bmp_data){
    BITMAPFILEHEADER bmpfh; // file header for a bmp file
    BITMAPINFOHEADER bmpih; // INFO header for BMP file

    // Copy the camera color data into bmp_data (BMP rows are stored bottom-up, starting from the lower left)
    for (int i=0; i<SVGA_VERTICAL_LINES; i++){
        for (int j=0; j<SVGA_HORIZONTAL_PIXELS; j++){
            bmp_data[(SVGA_VERTICAL_LINES-1)-i][j].red = (frame_buffer[i*SVGA_HORIZONTAL_PIXELS+j]>>16)&0xff;
            bmp_data[(SVGA_VERTICAL_LINES-1)-i][j].green = (frame_buffer[i*SVGA_HORIZONTAL_PIXELS+j]>>8)&0xff;
            bmp_data[(SVGA_VERTICAL_LINES-1)-i][j].blue = (frame_buffer[i*SVGA_HORIZONTAL_PIXELS+j])&0xff;
        }
    }

    // Assign a value to the file header of the BMP file
    bmpfh.bfType = 0x4d42;
    bmpfh.bfSize = SVGA_HORIZONTAL_PIXELS*SVGA_VERTICAL_LINES*3+54;
    bmpfh.bfReserved1 = 0;
    bmpfh.bfReserved2 = 0;
    bmpfh.bfOffBits = 0x36;

    // Assign a value to the INFO header of the BMP file
    bmpih.biSize = 0x28;
    bmpih.biWidth = SVGA_HORIZONTAL_PIXELS;
    bmpih.biHeight = SVGA_VERTICAL_LINES;
    bmpih.biPlanes = 0x1;
    bmpih.biBitCount = 24;
    bmpih.biCompression = 0;
    bmpih.biSizeImage = 0;
    bmpih.biXPixPerMeter = 3779;
    bmpih.biYPixPerMeter = 3779;
    bmpih.biClrUsed = 0;
    bmpih.biClrImporant = 0;

    // Writing of BMP file header
    fwrite(&bmpfh.bfType, sizeof(char), 2, fbmp);
    fwrite(&bmpfh.bfSize, sizeof(long), 1, fbmp);
    fwrite(&bmpfh.bfReserved1, sizeof(short), 1, fbmp);
    fwrite(&bmpfh.bfReserved2, sizeof(short), 1, fbmp);
    fwrite(&bmpfh.bfOffBits, sizeof(long), 1, fbmp);

    // Writing of BMP INFO header
    fwrite(&bmpih, sizeof(BITMAPINFOHEADER), 1, fbmp);

    // Writing of bmp_data
    for (int i=0; i<SVGA_VERTICAL_LINES; i++) {
        for (int j=0; j<SVGA_HORIZONTAL_PIXELS; j++) {
            fputc((int)bmp_data[i][j].blue, fbmp);
            fputc((int)bmp_data[i][j].green, fbmp);
            fputc((int)bmp_data[i][j].red, fbmp);
        }
    }

    return 0; // the function is declared int, so it must return a value
}
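
The pixel handling in WriteBMPfile() can be summarized in a small standalone sketch. The names Rgb24, unpack_pixel, bmp_row and bmp_file_size are hypothetical helpers introduced only for illustration (they are not part of bmp_header.h), and the sketch assumes the frame buffer holds one pixel per 32-bit word laid out as 0x00RRGGBB, which is what the shifts in the listing imply.

```cpp
#include <cassert>
#include <cstdint>

// One 24-bit pixel. Hypothetical type, not the BMP24FORMAT from bmp_header.h.
struct Rgb24 {
    uint8_t red;
    uint8_t green;
    uint8_t blue;
};

constexpr int kWidth = 800;   // SVGA_HORIZONTAL_PIXELS
constexpr int kHeight = 600;  // SVGA_VERTICAL_LINES

// Split one 32-bit frame buffer word (assumed 0x00RRGGBB) into color bytes.
inline Rgb24 unpack_pixel(uint32_t word) {
    return Rgb24{
        static_cast<uint8_t>((word >> 16) & 0xff),
        static_cast<uint8_t>((word >> 8) & 0xff),
        static_cast<uint8_t>(word & 0xff)};
}

// BMP stores rows bottom-up, so frame buffer line y becomes BMP row (height-1-y).
inline int bmp_row(int y) {
    assert(y >= 0 && y < kHeight);
    return (kHeight - 1) - y;
}

// File size: 14-byte file header + 40-byte info header + 3 bytes per pixel.
// A row is 800 * 3 = 2400 bytes, a multiple of 4, so no row padding is needed.
constexpr uint32_t bmp_file_size() {
    return static_cast<uint32_t>(kWidth) * kHeight * 3 + 54;
}
```

For the 800 x 600 image used here this gives a file size of 1440054 bytes, which is the value assigned to bfSize in the listing.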

The w command creates a BMP file whose name has the number incremented by 1, whereas the e command writes the BMP file under the same name as before.
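
The difference between the two commands comes down to how the file number is chosen. The following is a minimal sketch of that rule only. file_no_for_command and bmp_file_name are hypothetical names used for illustration, and the sketch uses std::string, so unlike the listing it would need to be built with g++.

```cpp
#include <cassert>
#include <string>

// Returns the file number to write for command cmd, given the current number.
// 'w' moves on to the next number; 'e' reuses the current one
// (or 0 when nothing has been written yet, i.e. file_no is still -1).
inline int file_no_for_command(char cmd, int file_no) {
    assert(cmd == 'w' || cmd == 'e');
    if (cmd == 'w') {
        return file_no + 1;
    }
    return (file_no == -1) ? 0 : file_no;
}

// Builds the file name in the same "<base><number>.bmp" form as the listing.
inline std::string bmp_file_name(const std::string& base, int file_no) {
    return base + std::to_string(file_no) + ".bmp";
}
```

Starting from the initial file_no of -1, both w and e produce bmp_file0.bmp. After that, each w moves on to the next number while e keeps overwriting the most recent file.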

Compiling with
gcc -o cam_capture_bmp cam_capture_bmp.cpp
succeeded and produced the cam_capture_bmp executable.
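
As a side note on the listing: the registers are accessed through `volatile unsigned *` mappings, so the array index is the byte offset divided by 4 (for example, index 64 is byte offset 0x100). The sketch below spells out that arithmetic together with the frame buffer size. reg_index and frame_offset are hypothetical helpers, and the frame count of 3 is an assumption inferred from the 5760000-byte mmap length and the SVGA_3_PICTURES macro name, since NUMBER_OF_WRITE_FRAMES is not defined in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Word index into a 32-bit register mapping for a given byte offset.
constexpr std::size_t reg_index(std::size_t byte_offset) {
    return byte_offset / sizeof(uint32_t);
}

constexpr std::size_t kBytesPerPixel = 4;                            // PIXEL_NUM_OF_BYTES
constexpr std::size_t kFrameBytes = 800 * 600 * kBytesPerPixel;      // SVGA_ALL_DISP_ADDRESS
constexpr std::size_t kNumFrames = 3;                                // assumed, see lead-in
constexpr std::size_t kFrameBufferBytes = kFrameBytes * kNumFrames;  // udmabuf0 mapping length

// Byte offset of frame n inside the udmabuf0 mapping.
inline std::size_t frame_offset(std::size_t n) {
    assert(n < kNumFrames);
    return n * kFrameBytes;
}
```

Under these assumptions the mapping length comes out to 5760000 bytes, and the second frame starts at byte offset 1920000, matching phys_addr+SVGA_ALL_DISP_ADDRESS in the DMAW4Gabor initialization.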

To use cam_capture_bmp, first load the device tree with
sudo dtbocfg.rb -i --dts devicetree.dts wl_tracing_cnn
and then run
sudo chmod 666 /dev/uio*
sudo chmod 666 /dev/udmabuf*
I have put these chmod commands in a shell script called uio_set, so I run that instead:
su
./uio_set
exit
I placed cam_capture_bmp in the bmp_files directory, so cd there:
cd bmp_files

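For the image viewer, I wanted one that automatically refreshes the display when the currently displayed image file is updated. After looking for such software, I decided to use Geeqie.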

Now let's start cam_capture_bmp and create a BMP file.
Run
./cam_capture_bmp
and enter
w
to create bmp_file0.bmp.

Double-clicking bmp_file0.bmp in nautilus launches Geeqie.

Entering the
e
command saves the image to the same bmp_file0.bmp, and the image shown in Geeqie is updated automatically.

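With this, I can now find just the right spot and take photos of the course.
Note that nautilus and Geeqie are displayed on Windows, because the X server is running on Windows. This means I can view these images on a laptop while photographing the course, which I think is very convenient.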

(Addendum)
Geeqie can be installed with
sudo apt install geeqie