change to Verilog2001

WangXuan95 2023-06-07 20:54:14 +08:00
parent ae9531b1cb
commit c2a5cc679b
10 changed files with 580 additions and 495 deletions

345
README.md

@ -1,8 +1,168 @@
![语言](https://img.shields.io/badge/语言-systemverilog_(IEEE1800_2005)-CAD09D.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg)
![语言](https://img.shields.io/badge/语言-verilog_(IEEE1364_2001)-9A90FD.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg)
中文 | [English](#en)
[English](#en) | [中文](#cn)
Hard-PNG
 
<span id="en">Hard-PNG</span>
===========================
FPGA-based streaming **png** image decoder, input png stream, output original pixels.
* Support image width less than 4000, height unlimited.
* **Supports all color types** : Grayscale, Grayscale+A, RGB, Indexed RGB, and RGB+A.
* Only 8bit depth is supported (actually most png images are 8bit depth).
| ![diagram](./figures/diagram.png) |
| :--------------------------------: |
| **Figure1** : diagram of Hard-PNG. |
 
# Background
**png** is the second most common compressed image format after **jpg**.
png image files use the **.png** file extension.
Take [SIM/test_image/img01.png](./SIM/test_image) in this repository as an example: it contains 98 bytes, which are called the png stream. We can view these bytes with [WinHex software](http://www.x-ways.net/winhex/):
```
0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, ...... , 0xAE, 0x42, 0x60, 0x82
```
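As a hedged illustration (not part of this repository), the Python sketch below shows why such a dump starts with `0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A` and ends with `0xAE, 0x42, 0x60, 0x82`: every png stream begins with the fixed 8-byte PNG signature and ends with an empty IEND chunk whose CRC is always the same. The helpers `chunk`, `make_minimal_png` and `looks_like_png` are hypothetical names, and the stream is a tiny 1x1 image built in memory, not img01.png.

```python
import struct
import zlib

# Fixed 8-byte signature that starts every png stream.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

def chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def make_minimal_png() -> bytes:
    """Hypothetical helper: a 1x1, 8-bit grayscale image built in memory."""
    # IHDR fields: width, height, bit depth, color type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x7f")  # filter-type byte 0, then one gray pixel
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", idat) + chunk(b"IEND", b""))

def looks_like_png(stream: bytes) -> bool:
    """True if the stream starts with the signature and ends with an IEND chunk."""
    return stream.startswith(PNG_SIGNATURE) and stream.endswith(chunk(b"IEND", b""))

stream = make_minimal_png()
head = ", ".join(f"0x{b:02X}" for b in stream[:6])
tail = ", ".join(f"0x{b:02X}" for b in stream[-4:])
print(head, ", ...... ,", tail)
```

This is only a format check; it does not decode anything, which is the job of hard_png.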
After the png stream is decompressed, the original pixels are produced. This is a small image with only 4 columns and 2 rows, 8 pixels in total. The hexadecimal representation of these pixels is shown below, where R, G, B, A represent the red, green, blue and transparency channels of each pixel.
| | column 1 | column 2 | column 3 | column 4 |
| :--: | :-----------------: | :-----------------: | :-----------------: | :-----------------: |
| row 1 | R:FF G:F2 B:00 A:FF | R:ED G:1C B:24 A:FF | R:00 G:00 B:00 A:FF | R:3F G:48 B:CC A:FF |
| row 2 | R:7F G:7F B:7F A:FF | R:ED G:1C B:24 A:FF | R:FF G:FF B:FF A:FF | R:FF G:AE B:C9 A:FF |
 
# Hard-PNG Usage
[hard_png.v](./RTL) in the [RTL](./RTL) directory is a module that takes a png stream as input and outputs the decompressed original pixels. Its interface is shown in **Figure2**.
| ![interface diagram](./figures/interface.png) |
| :--------------------------------: |
| **Figure2** : ports of hard_png. |
## Input png stream
The hard_png module is easy to use. Take the image [SIM/test_image/img01.png](./SIM/test_image) as an example again. As shown in **Figure3**, before inputting the png stream, drive a high-level pulse on `istart` (at least one clock cycle wide). Then input the png stream through the `ivalid` and `ibyte` signals; the png stream of this image has 98 bytes, and they must be input to hard_png one by one. `ivalid` and `iready` form a handshake: `ivalid=1` indicates that the user wants to send a byte to hard_png, and `iready=1` indicates that hard_png is ready to accept a byte. A byte on `ibyte` is accepted by hard_png only in a cycle where `ivalid` and `iready` are both 1.
| ![input timing diagram](./figures/wave1.png) |
| :---------------------------------------: |
| **Figure3** : input waveform of hard_png. |
After one png image has been fully input, the next png image can be input immediately or later (that is, pulse `istart` again, and then input the next png stream).
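The handshake rule can be illustrated with a small cycle-level model in Python. This is a sketch under stated assumptions, not the testbench from this repository: `feed_stream` is a hypothetical function, the `iready` pattern is made up, and the `istart` pulse that precedes the stream is not modeled. The sender holds `ivalid=1` and the current `ibyte` until a cycle in which `iready=1`, and only then moves on to the next byte.

```python
from typing import List, Sequence

def feed_stream(png_bytes: bytes, iready_per_cycle: Sequence[int]) -> List[int]:
    """Return the bytes that hard_png would accept, one list entry per handshake.

    iready_per_cycle holds the (assumed) value of iready in each clock cycle.
    """
    accepted = []
    index = 0                                   # next byte the sender wants to send
    for iready in iready_per_cycle:
        ivalid = 1 if index < len(png_bytes) else 0
        if ivalid and iready:                   # handshake succeeds in this cycle
            accepted.append(png_bytes[index])
            index += 1                          # present the next byte from now on
        # otherwise the sender keeps the same ibyte and tries again next cycle
    return accepted

# hard_png stalls for two cycles after the first byte in this made-up pattern.
result = feed_stream(bytes([0x89, 0x50, 0x4E]), [1, 0, 0, 1, 1, 1])
print([f"0x{b:02X}" for b in result])
```

No byte is dropped or duplicated while `iready=0`; the stall only delays the transfer.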
## Output image information and pixels
While the png stream is being input, the decompression result of the image (its basic information and the original pixels) is output from the module, as shown in **Figure4**. First, `ostart` goes high for one cycle, and `colortype`, `width`, and `height` are valid simultaneously, where:
- `width`, `height` are the width and height of the image.
- `colortype` is the color type of the png image, with the meaning in the table below.
| colortype | 3'd0 | 3'd1 | 3'd2 | 3'd3 | 3'd4 |
| :-------: | :-----------: | :---------: | :-----------: | :-----: | :-----------: |
| meaning | grayscale | grayscale+A | RGB | RGB+A | indexed RGB |
| remark | R=G=B, A=0xFF | R=G=B≠A | R≠G≠B, A=0xFF | R≠G≠B≠A | R≠G≠B, A=0xFF |
Then, `ovalid=1` means that a pixel is output in this cycle; the R, G, B, A channels of this pixel appear on the `opixelr`, `opixelg`, `opixelb`, and `opixela` signals respectively.
| ![output timing diagram](./figures/wave2.png) |
| :----------------------------------------: |
| **Figure4** : output waveform of hard_png. |
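The colortype table can be read as a set of constraints on the output channels. The Python sketch below encodes that reading; it is an assumption-laden illustration rather than code from this repository. In particular, it interprets "R≠G≠B" as "the channels may differ" (not "must differ"), and `pixel_matches_colortype` is a hypothetical helper name.

```python
# Codes as listed in the colortype table above (hard_png's own encoding).
COLORTYPE_NAMES = {
    0: "grayscale",
    1: "grayscale+A",
    2: "RGB",
    3: "RGB+A",
    4: "indexed RGB",
}

def pixel_matches_colortype(colortype: int, r: int, g: int, b: int, a: int) -> bool:
    """Check one output pixel against the 'remark' row of the table."""
    if colortype not in COLORTYPE_NAMES:
        raise ValueError(f"unknown colortype {colortype}")
    gray = colortype in (0, 1)        # grayscale types: R, G, B carry the same value
    opaque = colortype in (0, 2, 4)   # types without alpha: A is fixed at 0xFF
    if gray and not (r == g == b):
        return False
    if opaque and a != 0xFF:
        return False
    return True

# First pixel of the img01.png example, reported with colortype 3 (RGB+A).
print(COLORTYPE_NAMES[3], pixel_matches_colortype(3, 0xFF, 0xF2, 0x00, 0xFF))
```

A check like this can be handy when inspecting simulation output by hand, since a grayscale image with unequal R, G, B values would point to a wiring or decoding problem.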
 
# RTL Simulation
Simulation related files are in the [SIM](./SIM) folder, where:
- 14 png image files of different sizes and color types are provided in the [test_image](./SIM/test_image) folder.
- tb_hard_png.v is the testbench code that decompresses these images in sequence and writes the results (raw pixels) to txt files.
- tb_hard_png_run_iverilog.bat is the command script that runs the iverilog simulation.
- validation.py (a Python script) compares the simulation output with the result of software png decoding to verify correctness.
Before using iverilog for simulation, you need to install it; see [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md).
Then double-click tb_hard_png_run_iverilog.bat to run the simulation, which takes about 30 minutes (it can be forced to close halfway, but the generated simulation waveform is then incomplete).
After the simulation finishes, you can open the generated dump.vcd file to view the waveform.
In addition, each png image produces a corresponding .txt file that contains the decoding result. For example, img01.png produces out01.txt, which contains the 8 decoded pixel values:
```
decode result: colortype:3 width:4 height:2
fff200ff ed1c24ff 000000ff 3f48ccff 7f7f7fff ed1c24ff ffffffff ffaec9ff
```
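A hedged sketch of how such a result file could be parsed in Python is given below. It assumes the layout shown above (one header line, then whitespace-separated 8-digit hex words in R, G, B, A order); the exact whitespace in the real files is not confirmed here, and `parse_decode_result` is a hypothetical name, not a function from validation.py.

```python
import re

def parse_decode_result(text: str):
    """Parse the header line and the RRGGBBAA hex words of a result file."""
    header, _, body = text.partition("\n")
    match = re.match(
        r"decode result:\s*colortype:(\d+)\s+width:(\d+)\s+height:(\d+)", header)
    if match is None:
        raise ValueError("unexpected header line: " + header)
    colortype, width, height = (int(field) for field in match.groups())
    # Each word holds four bytes: R, G, B, A.
    pixels = [tuple(bytes.fromhex(word)) for word in body.split()]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width*height")
    return colortype, width, height, pixels

sample = (
    "decode result: colortype:3 width:4 height:2\n"
    "fff200ff ed1c24ff 000000ff 3f48ccff 7f7f7fff ed1c24ff ffffffff ffaec9ff\n"
)
colortype, width, height, pixels = parse_decode_result(sample)
print(colortype, width, height, pixels[0])
```

The pixels are listed row by row, so `pixels[y * width + x]` addresses the pixel in row `y`, column `x` (both counted from 0).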
## Correctness verification
To verify that the decompression results are correct, I provide a Python program [validation.py](./SIM), which decompresses the .png file and compares it pixel by pixel with the .txt file generated by the simulation. If all pixels match, the validation passes.
To run validation.py, you need Python3 with the [numpy](https://pypi.org/project/numpy/) and [PIL](https://pypi.org/project/Pillow/) libraries installed.
Then run validation.py with this command:
```
python validation.py test_image/img03.png out03.txt
```
This command compares each pixel in out03.txt with test_image/img03.png. The output is as follows (indicating that the verification passed):
```
size1= (400, 4)
size2= (400, 4)
total 400 pixels validation successful!!
```
 
# FPGA Deployment
## FPGA resource usage
| FPGA chip | Logic | Logic (%) | BRAM | BRAM (%) | max clk freq. (under timing closure) |
| :----------------------------: | :------: | :-------: | :--------: | :------: | :----------------------------------: |
| Xilinx Artix-7 XC7A35T | 2662×LUT | 13% | 22×BRAM36K | 44% | 66.6 MHz |
| Altera Cyclone IV EP4CE40F23C6 | 5277×LE | 13% | 427kbit | 37% | 56 MHz |
## Performance
When running at 50 MHz, the performance can be calculated from the number of clock cycles each image consumes in simulation.
Examples for some of the provided test files are shown below.
| png file | color type | image size | pixel count | png stream size | cycle count | time |
| :-------: | :---------: | :--------: | :---------: | :-------------: | :---------: | :---: |
| img05.png | RGB | 300x256 | 76800 | 96536 | 1105702 | 23ms |
| img06.png | Grayscale | 300x263 | 78900 | 37283 | 395335 | 8ms |
| img09.png | RGBA | 300x263 | 78900 | 125218 | 1382303 | 28ms |
| img10.png | Indexed RGB | 631x742 | 468202 | 193489 | 2374224 | 48ms |
| img14.png | Indexed RGB | 1920x1080 | 2073600 | 818885 | 10177644 | 204ms |
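The arithmetic behind the time column can be reproduced with a few lines of Python. This is a sketch of the calculation only: time = cycle count / clock frequency. That the table rounds up to the next whole millisecond is my reading of the numbers, not something stated in the source, and the cycles-per-pixel figure is an extra derived value.

```python
import math

CLOCK_HZ = 50_000_000  # 50 MHz, as stated above

def decode_time_ms(cycles: int, clock_hz: int = CLOCK_HZ) -> float:
    """Decode time in milliseconds for a given number of clock cycles."""
    return cycles * 1000 / clock_hz

# (file, pixel count, cycle count) copied from the table above
rows = [
    ("img05.png", 76800, 1105702),
    ("img06.png", 78900, 395335),
    ("img09.png", 78900, 1382303),
    ("img10.png", 468202, 2374224),
    ("img14.png", 2073600, 10177644),
]

for name, pixel_count, cycles in rows:
    ms = decode_time_ms(cycles)
    print(f"{name}: {ms:.2f} ms (rounded up: {math.ceil(ms)} ms), "
          f"{cycles / pixel_count:.1f} cycles per pixel")
```

At a different clock frequency, pass `clock_hz` explicitly; the cycle counts themselves come from simulation and do not depend on the clock.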
 
# Reference
* [upng](https://github.com/elanthis/upng): A lightweight C language png decoding library.
* [TinyPNG](https://tinypng.com/): A lossy compression tool using png's indexed RGB.
* [PNG Specification](https://www.w3.org/TR/REC-png.pdf).
 
 
 
 
<span id="cn">Hard-PNG</span>
===========================
FPGA-based streaming **png** image decoder: input a png stream, output original pixels.
@ -15,29 +175,29 @@ Hard-PNG
| :----: |
| **Figure1** : block diagram of Hard-PNG |
 
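# Background
png is the second most common image compression format after jpg. png supports a transparency channel (A channel), lossless compression, and indexed RGB (palette-based lossy compression). For colorful digital photos, png achieves a compression ratio of only 1 to 4. For synthetic images (such as graphic design), png can achieve a compression ratio of more than 10.
png is the second most common image compression format after jpg. png supports a transparency channel (A channel), lossless compression, and indexed RGB (palette-based lossy compression). png image files have the .png extension.
png image files have the .png extension. Take SIM/test_image/img01.png in this repository as an example: it contains 98 bytes, which are called the png stream. We can view these bytes with [WinHex software](http://www.x-ways.net/winhex/):
Take SIM/test_image/img01.png in this repository as an example: it contains 98 bytes, which are called the png stream. We can view these bytes with [WinHex software](http://www.x-ways.net/winhex/) (on Windows) or with the hexdump command (on Linux):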
```
0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, ...... , 0xAE, 0x42, 0x60, 0x82
```
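After this png stream is decoded, the original pixels are produced. This is a small image with only 4 columns and 2 rows, 8 pixels in total. The hexadecimal representation of these pixels is shown in the table below, where R, G, B, A represent the red, green, blue and transparency channels of each pixel.
After this png stream is decoded, the original pixels are produced. The image has only 4 columns and 2 rows, 8 pixels in total. The hexadecimal representation of these pixels is shown in the table below, where R, G, B, A represent the red, green, blue and transparency channels of each pixel.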
| | column 1 | column 2 | column 3 | column 4 |
| :---: | :---: | :---: | :---: | :---: |
| row 1 | R:FF G:F2 B:00 A:FF | R:ED G:1C B:24 A:FF | R:00 G:00 B:00 A:FF | R:3F G:48 B:CC A:FF |
| row 2 | R:7F G:7F B:7F A:FF | R:ED G:1C B:24 A:FF | R:FF G:FF B:FF A:FF | R:FF G:AE B:C9 A:FF |
 
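# Hard-PNG Usage
hard_png.sv in the RTL directory is a module that takes a png stream as input and outputs the decompressed pixels. Its interface is shown in **Figure2**.
hard_png.v in the RTL directory is a module that takes a png stream as input and outputs the decompressed pixels. Its interface is shown in **Figure2**.
| ![interface diagram](./figures/interface.png) |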
| :----: |
@ -71,14 +231,14 @@ hard_png is easy to use; taking the image SIM/test_image/img01.png as
| :----: |
| **Figure4** : output waveform of hard_png |
 
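# Simulation
Simulation-related files are all in the SIM folder, where:
- test_image provides 14 png image files of different sizes and color types.
- tb_hard_png.sv is the testbench code; it decompresses these images in sequence and writes the results (original pixels) to txt files.
- tb_hard_png.v is the testbench code; it decompresses these images in sequence and writes the results (original pixels) to txt files.
- tb_hard_png_run_iverilog.bat contains the command that runs the iverilog simulation.
- validation.py (Python code) compares the simulation output with the result of software png decoding to verify correctness.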
@ -116,30 +276,30 @@ size2= (400, 4)
total 400 pixels validation successful!!
```
 
# Deployment Information
## FPGA resource usage
| FPGA part | LUT | LUT(%) | FF | FF(%) | Logic | Logic(%) | BRAM | BRAM(%) |
| :----------------------------: | :--: | :----: | :--: | :---: | :---: | :------: | :-----: | :-----: |
| Xilinx Artix-7 XC7A35T | 2581 | 13% | 2253 | 5% | - | - | 792kbit | 44% |
| Altera Cyclone IV EP4CE40F23C6 | - | - | - | - | 4682 | 11% | 427kbit | 37% |
| FPGA part | Logic | Logic (%) | BRAM | BRAM (%) | max clk freq. (just reaching timing closure) |
| :----------------------------: | :------: | :-------: | :--------: | :------: | :---------------------: |
| Xilinx Artix-7 XC7A35T | 2662×LUT | 13% | 22×BRAM36K | 44% | 66.6 MHz |
| Altera Cyclone IV EP4CE40F23C6 | 5277×LE | 13% | 427kbit | 37% | 56 MHz |
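## Performance
hard_png is deployed on an Altera Cyclone IV EP4CE40F23C6 with clock frequency = 50MHz (just reaching timing closure). From the number of clock cycles each image consumes in simulation, the decompression performance can be calculated; examples are shown in the table below.
When running at 50MHz, the decompression performance can be calculated from the number of clock cycles each image consumes in simulation. Examples for some of the test files I provide are shown in the table below.
| file name | color type | image size | pixel count | png stream size (bytes) | clock cycles | time |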
| :-----------: | :----------: | :----------: | :--------------: | :---------------: | :---------------: | ------------- |
| :-----------: | :----------: | :----------: | :--------------: | :---------------: | :---------------: | :-----------: |
| img05.png | RGB | 300x256 | 76800 | 96536 | 1105702 | 23ms |
| img06.png | Grayscale | 300x263 | 78900 | 37283 | 395335 | 8ms |
| img09.png | RGBA | 300x263 | 78900 | 125218 | 1382303 | 28ms |
| img10.png | Indexed RGB | 631x742 | 468202 | 193489 | 2374224 | 48ms |
| img14.png | Indexed RGB | 1920x1080 | 2073600 | 818885 | 10177644 | 204ms |
 
# Reference links
@ -150,150 +310,3 @@ total 400 pixels validation successful!!
<span id="en">Hard-PNG</span>
===========================
FPGA-based streaming **png** image decoder, input png stream, output original pixels.
* Support image width less than 4000, height unlimited.
* **Supports all color types** : Grayscale, Grayscale+A, RGB, Indexed RGB, and RGB+A.
* Only 8bit depth is supported (actually most png images are 8bit depth).
| ![diagram](./figures/diagram.png) |
| :---------------------------------------: |
| **Figure1** : Hard-PNG schematic diagram. |
# Background
**png** is the second most common compressed image format after **jpg** . png supports transparency channel (A channel), lossless compression, and indexed RGB (palette-based lossy compression). In colorful digital photos, png can only get 1 to 4 times the lossless compression ratio. In synthetic images (such as graphic design), png can achieve more than 10 times the lossless compression ratio.
png image files use the **.png** file extension. Take [SIM/test_image/img01.png](./SIM/test_image) in this repository as an example: it contains 98 bytes, which are called the png stream. We can view these bytes with [WinHex software](http://www.x-ways.net/winhex/):
```
0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, ...... , 0xAE, 0x42, 0x60, 0x82
```
After the png stream is decompressed, the original pixels are produced. This is a small image with only 4 columns and 2 rows, 8 pixels in total. The hexadecimal representation of these pixels is shown below, where R, G, B, A represent the red, green, blue and transparency channels of each pixel.
| | column 1 | column 2 | column 3 | column 4 |
| :--: | :-----------------: | :-----------------: | :-----------------: | :-----------------: |
| row 1 | R:FF G:F2 B:00 A:FF | R:ED G:1C B:24 A:FF | R:00 G:00 B:00 A:FF | R:3F G:48 B:CC A:FF |
| row 2 | R:7F G:7F B:7F A:FF | R:ED G:1C B:24 A:FF | R:FF G:FF B:FF A:FF | R:FF G:AE B:C9 A:FF |
# Hard-PNG Usage
[hard_png.sv](./RTL) in the [RTL](./RTL) directory is a module that takes a png stream as input and outputs the decompressed original pixels. Its interface is shown in **Figure2**.
| ![interface diagram](./figures/interface.png) |
| :----------------------------------: |
| **Figure2** : interface of hard_png. |
## Input png stream
The hard_png module is easy to use. Take the image [SIM/test_image/img01.png](./SIM/test_image) as an example again. As shown in **Figure3**, before inputting the png stream, drive a high-level pulse on `istart` (at least one clock cycle wide). Then input the png stream through the `ivalid` and `ibyte` signals; the png stream of this image has 98 bytes, and they must be input to hard_png one by one. `ivalid` and `iready` form a handshake: `ivalid=1` indicates that the user wants to send a byte to hard_png, and `iready=1` indicates that hard_png is ready to accept a byte. A byte on `ibyte` is accepted by hard_png only in a cycle where `ivalid` and `iready` are both 1.
| ![input timing diagram](./figures/wave1.png) |
| :---------------------------------------: |
| **Figure3** : input waveform of hard_png. |
After one png image has been fully input, the next png image can be input immediately or later (that is, pulse `istart` again, and then input the next png stream).
## Output image information and pixels
While the png stream is being input, the decompression result of the image (its basic information and the original pixels) is output from the module, as shown in **Figure4**. First, `ostart` goes high for one cycle, and `colortype`, `width`, and `height` are valid simultaneously, where:
- `width`, `height` are the width and height of the image.
- `colortype` is the color type of the png image, with the meaning in the table below.
| colortype | 3'd0 | 3'd1 | 3'd2 | 3'd3 | 3'd4 |
| :-------: | :-----------: | :---------: | :-----------: | :-----: | :-----------: |
| meaning | grayscale | grayscale+A | RGB | RGB+A | indexed RGB |
| remark | R=G=B, A=0xFF | R=G=B≠A | R≠G≠B, A=0xFF | R≠G≠B≠A | R≠G≠B, A=0xFF |
Then, `ovalid=1` means that a pixel is output in this cycle; the R, G, B, A channels of this pixel appear on the `opixelr`, `opixelg`, `opixelb`, and `opixela` signals respectively.
| ![output timing diagram](./figures/wave2.png) |
| :----------------------------------------: |
| **Figure4** : output waveform of hard_png. |
# RTL Simulation
Simulation related files are in the [SIM](./SIM) folder, where:
- 14 png image files of different sizes and color types are provided in the [test_image](./SIM/test_image) folder.
- tb_hard_png.sv is the testbench code that decompresses these images in sequence and writes the results (raw pixels) to txt files.
- tb_hard_png_run_iverilog.bat is the command script that runs the iverilog simulation.
- validation.py (Python code) compares the simulation output with the result of software png decoding to verify correctness.
Before using iverilog for simulation, you need to install it; see [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md).
Then double-click tb_hard_png_run_iverilog.bat to run the simulation, which takes about 30 minutes (it can be forced to close halfway, but the generated simulation waveform is then incomplete).
After the simulation finishes, you can open the generated dump.vcd file to view the waveform.
In addition, each png image produces a corresponding .txt file that contains the decoding result. For example, img01.png produces out01.txt, which contains the 8 decoded pixel values:
```
decode result: colortype:3 width:4 height:2
fff200ff ed1c24ff 000000ff 3f48ccff 7f7f7fff ed1c24ff ffffffff ffaec9ff
```
## Correctness verification
To verify that the decompression results are correct, I provide a Python program [validation.py](./SIM), which decompresses the .png file and compares it pixel by pixel with the .txt file generated by the simulation. If all pixels match, the validation passes.
To run validation.py, you need Python3 with the [numpy](https://pypi.org/project/numpy/) and [PIL](https://pypi.org/project/Pillow/) libraries installed.
Then run validation.py with this command:
```
python validation.py test_image/img03.png out03.txt
```
This command compares each pixel in out03.txt with test_image/img03.png. The output is as follows (indicating that the verification passed):
```
size1= (400, 4)
size2= (400, 4)
total 400 pixels validation successful!!
```
# FPGA Deployment
## FPGA resource usage
| FPGA part | LUT | LUT(%) | FF | FF(%) | Logic | Logic(%) | BRAM | BRAM(%) |
| :----------------------------: | :--: | :----: | :--: | :---: | :---: | :------: | :-----: | :-----: |
| Xilinx Artix-7 XC7A35T | 2581 | 13% | 2253 | 5% | - | - | 792kbit | 44% |
| Altera Cyclone IV EP4CE40F23C6 | - | - | - | - | 4682 | 11% | 427kbit | 37% |
## Performance
I deployed hard_png on an Altera Cyclone IV EP4CE40F23C6 at a clock frequency of 50 MHz (just reaching timing closure). From the number of clock cycles consumed by each image during simulation, we can calculate the performance shown in the following table.
| png file | color type | image size | pixel count | png stream size | cycle count | time |
| :-------: | :---------: | :--------: | :---------: | :-------------: | :---------: | ----- |
| img05.png | RGB | 300x256 | 76800 | 96536 | 1105702 | 23ms |
| img06.png | Grayscale | 300x263 | 78900 | 37283 | 395335 | 8ms |
| img09.png | RGBA | 300x263 | 78900 | 125218 | 1382303 | 28ms |
| img10.png | Indexed RGB | 631x742 | 468202 | 193489 | 2374224 | 48ms |
| img14.png | Indexed RGB | 1920x1080 | 2073600 | 818885 | 10177644 | 204ms |
# Reference
* [upng](https://github.com/elanthis/upng): A lightweight C language png decoding library.
* [TinyPNG](https://tinypng.com/): A lossy compression tool using png's indexed RGB.
* [PNG Specification](https://www.w3.org/TR/REC-png.pdf).

File diff suppressed because one or more lines are too long


@ -2,7 +2,7 @@
//--------------------------------------------------------------------------------------------------------
// Module : huffman_builder
// Type : synthesizable, IP's sub module
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Standard: Verilog 2001 (IEEE1364-2001)
//--------------------------------------------------------------------------------------------------------
module huffman_builder #(
@ -18,10 +18,17 @@ module huffman_builder #(
rdaddr, rddata
);
function automatic integer clogb2(input integer val);
function integer clogb2;
input integer val;
//function automatic integer clogb2(input integer val);
integer valtmp;
begin
valtmp = val;
for(clogb2=0; valtmp>0; clogb2=clogb2+1) valtmp = valtmp>>1;
for (clogb2=0; valtmp>0; clogb2=clogb2+1)
valtmp = valtmp>>1;
end
endfunction
input rstn;
@ -46,46 +53,48 @@ wire done;
wire [clogb2(2*NUMCODES-1)-1:0] rdaddr;
reg [ OUTWIDTH-1:0] rddata;
reg [clogb2(NUMCODES)-1:0] blcount [BITLENGTH];
reg [ (1<<CODEBITS)-1:0] nextcode [BITLENGTH+1];
reg [clogb2(NUMCODES)-1:0] blcount [0 : BITLENGTH-1];
reg [ (1<<CODEBITS)-1:0] nextcode [0 : BITLENGTH];
initial for(int i=0; i< BITLENGTH; i++) blcount[i] = '0;
initial for(int i=0; i<=BITLENGTH; i++) nextcode[i] = '0;
integer i;
initial for(i=0; i< BITLENGTH; i=i+1) blcount[i] = 0;
initial for(i=0; i<=BITLENGTH; i=i+1) nextcode[i] = 0;
reg clear_tree2d = 1'b0;
reg build_tree2d = 1'b0;
reg [clogb2(BITLENGTH)-1:0] idx = '0;
reg [clogb2(2*NUMCODES-1)-1:0] clearidx = '0;
reg [ clogb2(NUMCODES)-1:0] nn='0, nnn, lnn='0;
reg [CODEBITS-1:0] ii='0, lii='0;
reg [CODEBITS-1:0] blenn, blen = '0;
reg [clogb2(BITLENGTH)-1:0] idx = 0;
reg [clogb2(2*NUMCODES-1)-1:0] clearidx = 0;
reg [ clogb2(NUMCODES)-1:0] nn=0, nnn, lnn=0;
reg [CODEBITS-1:0] ii=0, lii=0;
reg [CODEBITS-1:0] blenn, blen = 0;
wire [(1<<CODEBITS)-1:0] tree1d = nextcode[blen];
wire islast = (blen==0 || ii==0);
reg [clogb2(2*NUMCODES-1)-1:0] nodefilled = '0;
reg [clogb2(2*NUMCODES-1)-1:0] ntreepos, treepos='0;
reg [clogb2(2*NUMCODES-1)-1:0] nodefilled = 0;
reg [clogb2(2*NUMCODES-1)-1:0] ntreepos, treepos=0;
wire [clogb2(2*NUMCODES-1)-1:0] ntpos= {ntreepos[clogb2(2*NUMCODES-1)-2:0], tree1d[ii]};
reg [clogb2(2*NUMCODES-1)-1:0] tpos = '0;
reg [clogb2(2*NUMCODES-1)-1:0] tpos = 0;
reg rdfilled;
reg valid = 1'b0;
wire [OUTWIDTH-1:0] wrtree2d = (lii==0) ? lnn : nodefilled + (clogb2(2*NUMCODES-1))'(NUMCODES);
wire [OUTWIDTH-1:0] wrtree2d = (lii==0) ? lnn : (nodefilled + NUMCODES);
reg alldone = 1'b0;
assign done = alldone & run;
always @ (posedge clk or negedge rstn)
if(~rstn) begin
valid <= '0;
treepos <= '0;
tpos <= '0;
lii <= '0;
lnn <= '0;
valid <= 0;
treepos <= 0;
tpos <= 0;
lii <= 0;
lnn <= 0;
end else begin
if(istart) begin
valid <= '0;
treepos <= '0;
tpos <= '0;
lii <= '0;
lnn <= '0;
valid <= 0;
treepos <= 0;
tpos <= 0;
lii <= 0;
lnn <= 0;
end else begin
valid <= build_tree2d & nn<NUMCODES & blen>0;
treepos <= ntreepos;
@ -97,131 +106,139 @@ always @ (posedge clk or negedge rstn)
always @ (posedge clk or negedge rstn)
if(~rstn)
blen <= '0;
blen <= 0;
else begin
if(istart)
blen <= '0;
blen <= 0;
else if(islast)
blen <= blenn;
end
always @ (posedge clk or negedge rstn)
if(~rstn) begin
for(int i=0; i<BITLENGTH; i++)
blcount[i] <= '0;
for(i=0; i<BITLENGTH; i=i+1)
blcount[i] <= 0;
end else begin
if(istart | done) begin
for(int i=0; i<BITLENGTH; i++)
blcount[i] <= '0;
for(i=0; i<BITLENGTH; i=i+1)
blcount[i] <= 0;
end else begin
if(wren && wrdata<BITLENGTH)
blcount[wrdata] <= blcount[wrdata] + (clogb2(NUMCODES))'(1);
blcount[wrdata] <= blcount[wrdata] + 1;
end
end
always_comb
always @ (*)
if(build_tree2d)
nnn = (nn<NUMCODES && islast) ? nn + (clogb2(NUMCODES))'(1) : nn;
nnn = (nn<NUMCODES && islast) ? (nn + 1) : nn;
else if (idx<BITLENGTH)
nnn = 64'hFFFF_FFFF_FFFF_FFFF;
else
nnn = (idx<BITLENGTH) ? '1 : '0;
nnn = 0;
always @ (posedge clk or negedge rstn)
if(~rstn)
nn <= '0;
else
nn <= istart ? '0 : nnn;
nn <= 0;
else begin
if (istart)
nn <= 0;
else
nn <= nnn;
end
always @ (posedge clk or negedge rstn)
if(~rstn) begin
for(int i=0; i<=BITLENGTH; i++) nextcode[i] <= '0;
for(i=0; i<=BITLENGTH; i=i+1) nextcode[i] <= 0;
alldone <= 1'b0;
ii <= '0;
idx <= '0;
ii <= 0;
idx <= 0;
build_tree2d <= 1'b0;
clearidx <= '0;
clearidx <= 0;
clear_tree2d <= 1'b0;
end else begin
nextcode[0] <= '0;
nextcode[0] <= 0;
alldone <= 1'b0;
if(istart | ~run) begin
if(istart) for(int i=0; i<=BITLENGTH; i++) nextcode[i] <= '0;
ii <= '0;
idx <= '0;
if(istart) for(i=0; i<=BITLENGTH; i=i+1) nextcode[i] <= 0;
ii <= 0;
idx <= 0;
build_tree2d <= 1'b0;
clearidx <= '0;
clearidx <= 0;
clear_tree2d <= 1'b0;
end else if(run) begin
if(~clear_tree2d) begin
if( clearidx >= (clogb2(2*NUMCODES-1))'(2*NUMCODES-1) )
if ( clearidx >= (2*NUMCODES-1) )
clear_tree2d <= 1'b1;
clearidx <= clearidx + (clogb2(2*NUMCODES-1))'(1);
clearidx <= clearidx + 1;
end else if(build_tree2d) begin
if(nn < NUMCODES) begin
if(islast) begin
ii <= blenn - (CODEBITS)'(1);
ii <= blenn - 1;
if(blen>0)
nextcode[blen] <= tree1d + (1<<CODEBITS)'(1);
nextcode[blen] <= tree1d + 1;
end else
ii <= ii - (CODEBITS)'(1);
ii <= ii - 1;
end else
alldone <= 1'b1;
end else begin
if(idx<BITLENGTH) begin
idx <= idx + (clogb2(BITLENGTH))'(1);
nextcode[idx+1] <= ( ( nextcode[idx] + ((1<<CODEBITS)'(blcount[idx])) ) << 1 );
idx <= idx + 1;
nextcode[idx+1] <= ( ( nextcode[idx] + blcount[idx] ) << 1 );
end else begin
ii <= blen - (CODEBITS)'(1);
ii <= blen - 1;
build_tree2d <= 1'b1;
end
end
end
end
always_comb
always @ (*)
if(~run)
ntreepos = 0;
else if(valid) begin
if(~rdfilled)
ntreepos = (clogb2(2*NUMCODES-1))'(rddata) - (clogb2(2*NUMCODES-1))'(NUMCODES);
ntreepos = rddata - NUMCODES;
else if (lii==0)
ntreepos = 0;
else
ntreepos = (lii==0) ? '0 : nodefilled;
ntreepos = nodefilled;
end else
ntreepos = treepos;
always @ (posedge clk or negedge rstn)
if(~rstn) begin
nodefilled <= '0;
nodefilled <= 0;
end else begin
if(istart)
nodefilled <= '0;
nodefilled <= 0;
else if(~run)
nodefilled <= (clogb2(2*NUMCODES-1))'(1);
nodefilled <= 1;
else if(valid & rdfilled & lii>0)
nodefilled <= nodefilled + (clogb2(2*NUMCODES-1))'(1);
nodefilled <= nodefilled + 1;
end
reg [CODEBITS-1:0] mem_huffman_bitlens [NUMCODES];
reg [CODEBITS-1:0] mem_huffman_bitlens [0 : NUMCODES-1];
always @ (posedge clk)
if(wren)
mem_huffman_bitlens[wraddr] <= wrdata;
wire [clogb2(NUMCODES-1)-1:0] mem_rdaddr = (clogb2(NUMCODES-1))'(nnn) + (clogb2(NUMCODES-1))'(1);
wire [clogb2(NUMCODES-1)-1:0] mem_rdaddr = nnn + 1;
always @ (posedge clk)
blenn <= mem_huffman_bitlens[mem_rdaddr];
reg [OUTWIDTH:0] mem_tree2d [2*NUMCODES];
reg [OUTWIDTH:0] mem_tree2d [0 : 2*NUMCODES-1];
always @ (posedge clk)
if( ~clear_tree2d | (valid & rdfilled) )
mem_tree2d[ (clogb2(2*NUMCODES-1))'(~clear_tree2d ? clearidx : tpos ) ] <= ~clear_tree2d ? {1'b1, (OUTWIDTH)'(0)} : {1'b0, wrtree2d};
mem_tree2d[ (~clear_tree2d ? clearidx : tpos ) ] <= ~clear_tree2d ? {1'b1, {(OUTWIDTH){1'b0}}} : {1'b0, wrtree2d};
always @ (posedge clk)
{rdfilled, rddata} <= mem_tree2d[ (clogb2(2*NUMCODES-1))'(alldone ? rdaddr : ntpos ) ];
{rdfilled, rddata} <= mem_tree2d[ (alldone ? rdaddr : ntpos ) ];
endmodule


@ -2,7 +2,7 @@
//--------------------------------------------------------------------------------------------------------
// Module : huffman_decoder
// Type : synthesizable, IP's sub module
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Standard: Verilog 2001 (IEEE1364-2001)
//--------------------------------------------------------------------------------------------------------
module huffman_decoder #(
@ -10,18 +10,24 @@ module huffman_decoder #(
parameter OUTWIDTH = 10
)(
rstn, clk,
istart,
ien, ibit,
istart, ien, ibit,
oen, ocode,
rdaddr, rddata
);
function automatic integer clogb2(input integer val);
function integer clogb2;
input integer val;
//function automatic integer clogb2(input integer val);
integer valtmp;
begin
valtmp = val;
for(clogb2=0; valtmp>0; clogb2=clogb2+1) valtmp = valtmp>>1;
for (clogb2=0; valtmp>0; clogb2=clogb2+1)
valtmp = valtmp>>1;
end
endfunction
input rstn, clk;
input istart, ien, ibit;
output oen;
@ -32,37 +38,41 @@ input [ OUTWIDTH-1:0] rddata;
wire rstn, clk;
wire istart, ien, ibit;
reg oen = 1'b0;
reg [ OUTWIDTH-1:0] ocode = '0;
reg [ OUTWIDTH-1:0] ocode = 0;
wire [clogb2(2*NUMCODES-1)-1:0] rdaddr;
wire [ OUTWIDTH-1:0] rddata;
reg [clogb2(2*NUMCODES-1)-2:0] tpos = '0;
reg [clogb2(2*NUMCODES-1)-2:0] tpos = 0;
wire [clogb2(2*NUMCODES-1)-2:0] ntpos;
reg ienl = 1'b0;
assign rdaddr = {ntpos, ibit};
assign ntpos = ienl ? (clogb2(2*NUMCODES-1)-1)'(rddata<(OUTWIDTH)'(NUMCODES) ? '0 : rddata-(OUTWIDTH)'(NUMCODES)) : tpos;
assign ntpos = ienl ? ((rddata<NUMCODES) ? 0 : (rddata-NUMCODES)) : tpos;
always @ (posedge clk or negedge rstn)
if(~rstn)
ienl <= '0;
ienl <= 1'b0;
else
ienl <= istart ? '0 : ien;
ienl <= istart ? 1'b0 : ien;
always @ (posedge clk or negedge rstn)
if(~rstn)
tpos <= '0;
else
tpos <= istart ? '0 : ntpos;
tpos <= 0;
else begin
if (istart)
tpos <= 0;
else
tpos <= ntpos;
end
always_comb
always @ (*)
if(ienl && rddata<NUMCODES) begin
oen = 1'b1;
ocode = rddata;
end else begin
oen = 1'b0;
ocode = '0;
ocode = 0;
end
endmodule


@ -2,7 +2,7 @@
//--------------------------------------------------------------------------------------------------------
// Module : tb_hard_png
// Type : simulation, top
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Standard: Verilog 2001 (IEEE1364-2001)
// Function: testbench for hard_png
//--------------------------------------------------------------------------------------------------------
@ -18,7 +18,7 @@
module tb_hard_png ();
initial $dumpvars(0, tb_hard_png);
initial $dumpvars(1, tb_hard_png);
reg rstn = 1'b0;
@ -28,10 +28,10 @@ initial begin repeat(4) @(posedge clk); rstn<=1'b1; end
reg istart = '0;
reg istart = 1'b0;
reg ivalid = 1'b0;
wire iready;
reg [ 7:0] ibyte = '0;
reg [ 7:0] ibyte = 0;
wire ostart;
wire [ 2:0] colortype;
@ -66,14 +66,14 @@ hard_png hard_png_i (
int fptxt = 0, fppng = 0;
integer fptxt = 0, fppng = 0;
reg [256*8:1] fname_png;
reg [256*8:1] fname_txt;
int png_no = 0;
int txt_no = 0;
int cyccnt = 0;
int bytecnt = 0;
integer png_no = 0;
integer txt_no = 0;
integer ii;
integer cyccnt = 0;
integer bytecnt = 0;
initial begin
while(~rstn) @(posedge clk);
@ -105,9 +105,9 @@ initial begin
end
if( ivalid & iready ) begin
ibyte <= $fgetc(fppng);
bytecnt++;
bytecnt = bytecnt + 1;
end
cyccnt++;
cyccnt = cyccnt + 1;
end
ivalid <= 1'b0;
@ -131,7 +131,7 @@ initial begin
$finish;
end
for(int ii=0; ii<width*height; ii++) begin
for(ii=0; ii<width*height; ii=ii+1) begin
@ (posedge clk);
while(~ovalid) @ (posedge clk);
$fwrite(fptxt, "%02x%02x%02x%02x ", opixelr, opixelg, opixelb, opixela);


@ -1,5 +1,5 @@
del sim.out dump.vcd
iverilog -g2005-sv -o sim.out tb_hard_png.sv ../RTL/hard_png.sv ../RTL/huffman_builder.sv ../RTL/huffman_decoder.sv
iverilog -g2001 -o sim.out tb_hard_png.v ../RTL/hard_png.v ../RTL/huffman_builder.v ../RTL/huffman_decoder.v
vvp -n sim.out
del sim.out
pause

Binary file not shown. Size: 13 KiB before, 8.2 KiB after.

Binary file not shown. Size: 26 KiB before, 11 KiB after.

Binary file not shown. Size: 15 KiB before, 5.6 KiB after.

Binary file not shown. Size: 33 KiB before, 12 KiB after.