release v1.0

WangXuan95 2023-03-03 20:48:37 +08:00
commit a03807ee47
15 changed files with 2994 additions and 0 deletions

2
.gitignore vendored Normal file

@ -0,0 +1,2 @@
**/vivado
**/quartus

296
README.md Normal file

@ -0,0 +1,296 @@
![语言](https://img.shields.io/badge/语言-systemverilog_(IEEE1800_2005)-CAD09D.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg)
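Chinese | [English](#en)
FPGA JPEG-LS image compressor
===========================
An **FPGA**-based streaming **JPEG-LS** image compressor. Features:
* Compresses **8-bit** grayscale images.
* Supports **lossless mode** (NEAR=0).
* Supports **lossy mode** with NEAR adjustable from 1 to 7.
* Image width can range from 5 to 16384 and image height from 1 to 16384.
* Minimal streaming input and output interface.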
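# Background
**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm. Its lossless mode achieves an excellent compression ratio, better than Lossless-JPEG, Lossless-JPEG2000, Lossless-JPEG-XR, FELICS, and others. **JPEG-LS** controls distortion through the **NEAR** value, the maximum difference between a pixel before and after compression: **NEAR=0** is lossless mode and **NEAR>0** is lossy mode. The larger the **NEAR** value, the greater the distortion and the higher the compression ratio. **JPEG-LS** compressed images use the file suffix .**jls**.
# Module Usage
[**jls_encoder.sv**](./RTL/jls_encoder.sv) in the RTL directory is the JPEG-LS compression module that users can instantiate. It takes raw image pixels as input and outputs a JPEG-LS compressed stream.
## Module parameter
**jls_encoder** has a single parameter: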
```verilog
parameter logic [2:0] NEAR
```
This parameter sets the **NEAR** value. With 3'd0 the module works in lossless mode; with 3'd1~3'd7 it works in lossy mode.
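## Module Interface
The input and output signals of **jls_encoder** are described in the following table.
| Signal | Name | Direction | Width | Description |
| :---: | :---: | :---: | :---: | :--- |
| rstn | synchronous reset | input | 1bit | If rstn=0 at a rising clock edge, the module is reset. Keep rstn=1 during normal operation. |
| clk | clock | input | 1bit | Clock. All signals should be aligned to the rising edge of clk. |
| i_sof | start of frame | input | 1bit | To input a new image, hold i_sof=1 for at least 368 clock cycles. |
| i_w | image width-1 | input | 14bit | For example, if the image width is 1920, set i_w to 14'd1919. Must remain valid while i_sof=1. |
| i_h | image height-1 | input | 14bit | For example, if the image height is 1080, set i_h to 14'd1079. Must remain valid while i_sof=1. |
| i_e | input pixel valid | input | 1bit | When i_e=1, a pixel must be present on i_x. |
| i_x | input pixel | input | 8bit | The pixel value range is 8'd0 ~ 8'd255. |
| o_e | output valid | output | 1bit | When o_e=1, output stream data is present on o_data. |
| o_data | output stream data | output | 16bit | Big-endian: o_data[15:8] comes first in the stream, o_data[7:0] comes after. |
| o_last | output stream last | output | 1bit | When o_e=1 and o_last=1, this is the last data word of the output stream of an image. |
> Note: i_w cannot be less than 14'd4.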
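## Input pixels
The operation flow of the **jls_encoder** module is:
1. **Reset** (optional): Hold rstn=0 for at least **1 cycle** to reset the module, then keep rstn=1 during normal operation. The reset can also be skipped entirely, i.e. rstn may be tied to 1.
2. **Start**: Hold i_sof=1 for **at least 368 cycles** while driving the image width and height onto i_w and i_h. i_w and i_h must remain valid for as long as i_sof=1.
3. **Input**: Drive i_e and i_x to input all pixels of the image from left to right, top to bottom. Whenever i_e=1, i_x is taken as one pixel.
4. **Idle between images**: After the last pixel, stay idle (i_sof=0, i_e=0) for **at least 16 cycles**. Then go back to step 2 to start the next image.
Any number of idle bubbles (i_sof=0, i_e=0) may be inserted between i_sof=1 and the first i_e=1, and between consecutive i_e=1 cycles, so pixels can be input intermittently. Maximum throughput is reached when no bubbles are inserted.
The following figure shows the input timing for compressing 2 images (// marks omitted cycles, X means don't care). Image 1 has 1 bubble inserted after its first pixel, while image 2 has 1 bubble inserted after i_sof=1. Note that the **idle period between images** must last at least **16 cycles**.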
__ __// __ __ __ __ //_ __ // __ __// __ __ __ // __
clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_
_______//________ // // _______//________ //
i_sof ____/ // \________________//___________//____/ // \___________//________
_______//________ // // _______//________ //
i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
_______//________ // // _______//________ //
i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
// _____ ____//_____ // // _____//____
i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___
// _____ ____//_____ // // _____//____
i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX
Stage: | start image 1 | input image 1 | inter-image idle | start image 2 | input image 2
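## Output JLS stream
While pixels are being input, **jls_encoder** also outputs the compressed **JPEG-LS stream**, which forms the complete content of a .jls file, including the file header and trailer. When o_e=1, o_data carries one valid output word. o_data is big-endian: o_data[15:8] comes earlier in the stream and o_data[7:0] comes later. On the last word of each image's output stream, o_last=1 marks the end of that image's compressed stream.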
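# RTL Simulation
Simulation-related files are in the SIM directory, including:
* tb_jls_encoder.sv is the testbench for jls_encoder. It feeds the uncompressed .pgm images in a specified folder into jls_encoder in batch and saves the output of jls_encoder to .jls files.
* tb_jls_encoder_run_iverilog.bat contains the command that runs the iverilog simulation.
* The images folder contains several image files in .pgm format. A .pgm file stores an uncompressed 8-bit grayscale image (i.e. raw pixels). It can be opened with Photoshop or a Linux image viewer (the Windows image viewer cannot open it).
> The .pgm file format is very simple: a header gives the width and height of the image, followed directly by all of its raw pixels. I chose .pgm as the simulation input format because a small amount of testbench code is enough to parse it and send the pixels to jls_encoder. You do not need to care about the .pgm format itself, because jls_encoder is independent of it and only takes raw pixels as input. Focus on the simulation waveform and on how the pixels are fed into jls_encoder.
Before simulating with iverilog, you need to install iverilog, see: [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md)
Then double-click tb_jls_encoder_run_iverilog.bat to run the simulation, which takes more than ten minutes.
When the simulation finishes, several .jls files appear in the folder. These are the compressed images. The simulation also produces the waveform file dump.vcd, which you can open with gtkwave.
In addition, you can modify some simulation parameters:
- Modify the macro **NEAR** in tb_jls_encoder.sv to change the compression ratio.
- Modify the macro **BUBBLE_CONTROL** in tb_jls_encoder.sv to set how many bubbles are inserted between adjacent input pixels:
  - When **BUBBLE_CONTROL=0**, no bubbles are inserted.
  - When **BUBBLE_CONTROL>0**, **BUBBLE_CONTROL** bubbles are inserted.
  - When **BUBBLE_CONTROL<0**, a random number of bubbles in **0~(-BUBBLE_CONTROL)** is inserted each time.
> Under different NEAR and BUBBLE_CONTROL values, this repository has been verified by comparing results on several hundred images, which gives strong confidence that it is free of bugs. (The automated verification code is not included here.)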
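## View compressed JLS file
Because **JPEG-LS** is a niche, specialized format, most image viewers cannot open .jls files.
You can try [this site](https://filext.com/file-extension/JLS) to view .jls files (though the site is often unavailable).
If the site does not work, you can use the provided decoder decoder.exe to decompress the file back to .pgm and view that instead. Run the following command from CMD in the SIM directory: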
```powershell
.\decoder.exe <JLS_FILE_NAME> <PGM_FILE_NAME>
```
For example:
```powershell
.\decoder.exe test000.jls tmp.pgm
```
> Note: decoder.exe is compiled from the C source code provided by UBC: http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
# FPGA Deployment
On a Xilinx Artix-7 xc7a35tcsg324-2, the synthesis and implementation results are as follows.
| LUT | FF | BRAM | Max clock frequency |
| :--------: | :------: | :----------------------------: | :----------: |
| 2347 (11%) | 932 (2%) | 9 x RAMB18 (9%), equivalent to 144Kbit | 35 MHz |
At 35 MHz the compression throughput is 35 Mpixel/s, which corresponds to a frame rate of 16.8 fps for 1920x1080 images.
<span id="en">FPGA JPEG-LS image compressor</span>
===========================
An **FPGA**-based streaming **JPEG-LS** image compressor. Features:
* Compresses **8-bit** grayscale images.
* Supports **lossless mode** (NEAR=0).
* Supports **lossy mode** with NEAR adjustable from 1 to 7.
* Image width can range from 5 to 16384 and image height from 1 to 16384.
* Minimal streaming input and output interface.
# Background
**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm. Its lossless mode achieves an excellent compression ratio, better than the lossless modes of JPEG2000 and JPEG-XR. **JPEG-LS** controls distortion through the **NEAR** value, the maximum difference between a pixel before and after compression: **NEAR=0** is lossless mode and **NEAR>0** is lossy mode. The larger the **NEAR** value, the greater the distortion and the higher the compression ratio. **JPEG-LS** compressed images use the file suffix .**jls**.
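
The NEAR bound described above can be checked in software after decoding. The following is a minimal Python sketch, assuming the original and decoded images are available as flat sequences of 8-bit pixel values; `max_abs_error` and `within_near` are hypothetical helper names, not part of this repository.

```python
def max_abs_error(original, decoded):
    """Largest per-pixel absolute difference between two equally sized images."""
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    return max((abs(a - b) for a, b in zip(original, decoded)), default=0)


def within_near(original, decoded, near):
    """True if every decoded pixel differs from the original by at most `near`."""
    return max_abs_error(original, decoded) <= near
```

With NEAR=0 this check reduces to requiring that the two images are identical.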
# Module Usage
[**jls_encoder.sv**](./RTL/jls_encoder.sv) in the [RTL](./RTL) directory is the JPEG-LS compression module that users can instantiate. It takes raw image pixels as input and outputs a JPEG-LS compressed stream.
## Module parameter
**jls_encoder** has a single parameter:
```verilog
parameter logic [2:0] NEAR
```
This parameter sets the NEAR value of the JPEG-LS algorithm. With 3'd0 the module works in lossless mode; with 3'd1~3'd7 it works in lossy mode.
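
Several internal constants follow from NEAR. The Python sketch below mirrors the `localparam` expressions in RTL/jls_encoder.sv (`P_T1`, `P_T2`, `P_T3`, `P_QUANT`, `P_QBETA`, `P_QBETAHALF`, `P_LIMIT`, `P_AINIT`) for 8-bit pixels. The function name `derived_params` is hypothetical, and the closed-form expression for QBPP is an assumption that reproduces the `P_QBPPS` lookup table used in the RTL.

```python
def derived_params(near):
    """Software model of the NEAR-dependent constants in jls_encoder.sv."""
    if not 0 <= near <= 7:
        raise ValueError("NEAR must be in 0..7")
    quant = 2 * near + 1                 # P_QUANT = {P_NEAR, 1'b1}
    qbeta = (256 + 4 * near) // quant    # P_QBETA
    qbpp = (qbeta - 1).bit_length()      # ceil(log2(qbeta)), matches P_QBPPS
    return {
        "T1": 3 + 3 * near,              # P_T1
        "T2": 7 + 5 * near,              # P_T2
        "T3": 21 + 7 * near,             # P_T3
        "QUANT": quant,
        "QBETA": qbeta,
        "QBETAHALF": (qbeta + 1) // 2,   # P_QBETAHALF
        "QBPP": qbpp,
        "LIMIT": 31 - qbpp,              # P_LIMIT
        "AINIT": 4 if near == 0 else 2,  # P_AINIT
    }
```

For example, `derived_params(0)` gives the lossless thresholds T1=3, T2=7, T3=21 and QBETA=256.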
## Module Interface
The input and output signals of **jls_encoder** are described in the following table.
| Signal | Name | Direction | Width | Description |
| :----: | :------------: | :-------: | :---: | :----------------------------------------------------------- |
| rstn | synchronous reset | in | 1bit | If rstn=0 at a rising clock edge, the module is reset. Keep rstn=1 during normal operation. |
| clk | clock | in | 1bit | All signals should be aligned to the rising edge of clk. |
| i_sof | start of frame | in | 1bit | To input a new image, hold i_sof=1 for at least 368 clock cycles. |
| i_w | width-1 | in | 14bit | For example, if the image width is 1920, i_w should be set to 14'd1919. Must remain valid while i_sof=1. |
| i_h | height-1 | in | 14bit | For example, if the image height is 1080, i_h should be set to 14'd1079. Must remain valid while i_sof=1. |
| i_e | input valid | in | 1bit | i_e=1 indicates that a valid input pixel is on i_x. |
| i_x | input pixel | in | 8bit | The pixel value range is 8'd0 ~ 8'd255. |
| o_e | output valid | out | 1bit | o_e=1 indicates that valid data is on o_data. |
| o_data | output data | out | 16bit | Big-endian: o_data[15:8] comes first in the stream, o_data[7:0] comes after. |
| o_last | output last | out | 1bit | When o_e=1, o_last=1 indicates the last data word of the output stream of an image. |
> Note: i_w cannot be less than 14'd4.
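
Because `i_w` and `i_h` carry the size minus one, a testbench or driver has to convert and range-check the image dimensions. The sketch below is one way to do it in Python; `size_fields` is a hypothetical helper. It uses the ranges stated in this README, and note that the port comment in RTL/jls_encoder.sv gives 16383 as the maximum height, so treating 16384 as valid is an assumption taken from the README.

```python
def size_fields(width, height):
    """Return (i_w, i_h), the 14-bit 'size minus one' values for jls_encoder."""
    if not 5 <= width <= 16384:
        raise ValueError("width must be in 5..16384 (i_w must be >= 4)")
    if not 1 <= height <= 16384:
        raise ValueError("height must be in 1..16384")
    return width - 1, height - 1
```

For a 1920x1080 image this yields `(1919, 1079)`, matching the examples in the table.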
## Input pixels
The operation flow of **jls_encoder** module is:
1. **Reset** (optional): Hold `rstn=0` for at least **1 cycle** to reset the module, then keep `rstn=1` during normal operation. The reset can also be skipped entirely, i.e. `rstn` may be tied to 1.
2. **Start**: Hold `i_sof=1` for **at least 368 cycles** while driving the image width and height onto `i_w` and `i_h`. `i_w` and `i_h` must remain valid for as long as `i_sof=1`.
3. **Input**: Drive `i_e` and `i_x` to input all pixels of the image from left to right, top to bottom. Whenever `i_e=1`, `i_x` is taken as one pixel.
4. **Idle between images**: After the last pixel, stay idle (`i_sof=0`, `i_e=0`) for **at least 16 cycles**. Then go back to step 2 to start the next image.
Any number of idle bubbles (`i_sof=0`, `i_e=0`) may be inserted between `i_sof=1` and the first `i_e=1`, and between consecutive `i_e=1` cycles, so pixels can be input intermittently. Maximum throughput is reached when no bubbles are inserted.
The following figure shows the input timing for compressing 2 images (`//` marks omitted cycles, X means don't care). Image 1 has 1 bubble inserted after its first pixel, while image 2 has 1 bubble inserted after `i_sof=1`. Note that the **idle period between images** must last at least **16 cycles**.
__ __// __ __ __ __ //_ __ // __ __// __ __ __ // __
clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_
_______//________ // // _______//________ //
i_sof ____/ // \________________//___________//____/ // \___________//________
_______//________ // // _______//________ //
i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
_______//________ // // _______//________ //
i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
// _____ ____//_____ // // _____//____
i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___
// _____ ____//_____ // // _____//____
i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX
Stage: | start image 1 | input image 1 | inter-image idle | start image 2 | input image 2
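
The four-step input protocol can also be expressed as a cycle-by-cycle stimulus generator. This Python sketch is only a software model of the timing rules stated above (368 start cycles, optional bubbles, 16 idle cycles); `frame_stimulus` and its arguments are hypothetical, and don't-care values are driven as 0 for simplicity.

```python
SOF_CYCLES = 368   # minimum i_sof=1 duration stated in this README
IDLE_CYCLES = 16   # minimum idle gap after the last pixel


def frame_stimulus(pixels, width, height, bubbles=0):
    """Yield one (i_sof, i_w, i_h, i_e, i_x) tuple per clock cycle for one image."""
    if len(pixels) != width * height:
        raise ValueError("pixel count must equal width * height")
    if bubbles < 0:
        raise ValueError("bubbles must be non-negative")
    i_w, i_h = width - 1, height - 1
    # Step 2: start phase, i_w and i_h held valid while i_sof=1.
    for _ in range(SOF_CYCLES):
        yield (1, i_w, i_h, 0, 0)
    # Step 3: pixels in raster order, each preceded by `bubbles` idle cycles.
    for x in pixels:
        for _ in range(bubbles):
            yield (0, 0, 0, 0, 0)
        yield (0, 0, 0, 1, x)
    # Step 4: idle gap before the next image may start.
    for _ in range(IDLE_CYCLES):
        yield (0, 0, 0, 0, 0)
```

Chaining several calls to `frame_stimulus` back to back produces a legal multi-image sequence, since each frame already ends with the required idle gap.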
## Output JLS stream
While pixels are being input, **jls_encoder** also outputs the compressed **JPEG-LS stream**, which forms the complete content of a .jls file, including the file header and trailer. When `o_e=1`, `o_data` carries one valid output word. `o_data` is big-endian: `o_data[15:8]` comes earlier in the stream and `o_data[7:0]` comes later. On the last word of each image's output stream, `o_last=1` marks the end of that image's compressed stream.
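
To turn the output words into .jls file bytes, each valid 16-bit word is written high byte first until `o_last` is seen. A minimal Python sketch, assuming the outputs have been sampled once per clock as `(o_e, o_data, o_last)` tuples; `collect_stream` is a hypothetical helper name.

```python
def collect_stream(samples):
    """Build the .jls byte stream of one image from per-cycle output samples."""
    out = bytearray()
    for o_e, o_data, o_last in samples:
        if not o_e:
            continue                      # o_data and o_last only count when o_e=1
        out += o_data.to_bytes(2, "big")  # o_data[15:8] first, then o_data[7:0]
        if o_last:
            break                         # last word of this image
    return bytes(out)
```

The returned bytes can be written directly to a file opened in binary mode.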
# RTL Simulation
Simulation-related files are in the [SIM](./SIM) directory, including:
* [tb_jls_encoder.sv](./SIM) is the testbench for jls_encoder. It feeds the uncompressed .pgm images in a specified folder into jls_encoder in batch and saves the output of jls_encoder to .jls files.
* [tb_jls_encoder_run_iverilog.bat](./SIM) is the command script that runs the iverilog simulation.
* The [images](./SIM) folder contains several image files in .pgm format. A .pgm file stores an uncompressed 8-bit grayscale image (i.e. raw pixels). It can be opened with Photoshop or a Linux image viewer (the Windows image viewer cannot open it).
> The .pgm file format is very simple: a header gives the width and height of the image, followed directly by all of its raw pixels. I chose .pgm as the simulation input format because a small amount of testbench code is enough to parse it and send the pixels to jls_encoder. You do not need to care about the .pgm format itself, because jls_encoder is independent of it and only takes raw pixels as input. Focus on the simulation waveform and on how the pixels are fed into jls_encoder.
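
For readers who want to prepare or inspect input images outside the testbench, the following Python sketch parses a .pgm file held in memory. It assumes the binary "P5" variant with a maximum value of at most 255 (this README does not state which variant the bundled images use), and `parse_pgm` is a hypothetical helper, not the parser used in tb_jls_encoder.sv.

```python
def parse_pgm(data):
    """Parse a binary (P5) 8-bit PGM; return (width, height, pixels)."""
    tokens, pos = [], 0
    while len(tokens) < 4:                 # magic, width, height, maxval
        while data[pos:pos + 1].isspace():
            pos += 1                       # skip whitespace between tokens
        if data[pos:pos + 1] == b"#":      # header comment runs to end of line
            while pos < len(data) and data[pos:pos + 1] != b"\n":
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated PGM header")
        tokens.append(data[start:pos])
    pos += 1                               # single whitespace byte before raster
    width, height, maxval = (int(t) for t in tokens[1:])
    if tokens[0] != b"P5" or maxval > 255:
        raise ValueError("only 8-bit binary PGM (P5) is supported")
    pixels = data[pos:pos + width * height]
    if len(pixels) != width * height:
        raise ValueError("truncated pixel data")
    return width, height, pixels
```

The returned `pixels` are already in raster order, which is the order jls_encoder expects.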
Before simulating with iverilog, you need to install iverilog, see: [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md)
Then double-click tb_jls_encoder_run_iverilog.bat to run the simulation, which takes more than 10 minutes.
When the simulation finishes, several .jls files appear in the folder. These are the compressed images. The simulation also produces the waveform file dump.vcd, which you can open with gtkwave.
In addition, you can modify some simulation parameters:
- Modify the macro **NEAR** in tb_jls_encoder.sv to change the compression ratio.
- Modify the macro **BUBBLE_CONTROL** in tb_jls_encoder.sv to set how many bubbles are inserted between adjacent input pixels:
  - When **BUBBLE_CONTROL=0**, no bubbles are inserted.
  - When **BUBBLE_CONTROL>0**, **BUBBLE_CONTROL** bubbles are inserted.
  - When **BUBBLE_CONTROL<0**, a random number of bubbles in **0~(-BUBBLE_CONTROL)** is inserted each time.
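
The three BUBBLE_CONTROL cases can be summarized as a single function. This Python sketch models the rule as described in the list above, not the testbench code itself; `bubble_count` is a hypothetical name, and treating the random range as inclusive at both ends is an assumption.

```python
import random


def bubble_count(bubble_control, rng=random):
    """Number of idle cycles to insert before the next pixel."""
    if bubble_control >= 0:
        return bubble_control                # fixed count (0 means no bubbles)
    return rng.randint(0, -bubble_control)   # random count in 0..(-BUBBLE_CONTROL)
```

Passing a seeded `random.Random` instance as `rng` makes a randomized run reproducible.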
## View compressed JLS file
Because **JPEG-LS** is a niche, specialized format, most image viewers cannot open .jls files.
You can try [this site](https://filext.com/file-extension/JLS) to view .jls files (though the site is sometimes unavailable).
If the site does not work, you can use the provided decoder [decoder.exe](./SIM) to decompress the file back to .pgm and view that instead. Run the following command from CMD in the [SIM](./SIM) directory:
```powershell
.\decoder.exe <JLS_FILE_NAME> <PGM_FILE_NAME>
```
For example:
```powershell
.\decoder.exe test000.jls tmp.pgm
```
> Note: decoder.exe is compiled from the C source code provided by UBC: http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
# FPGA Deployment
On a Xilinx Artix-7 xc7a35tcsg324-2, the synthesis and implementation results are as follows.
| LUT | FF | BRAM | Max clock frequency |
| :--------: | :------: | :----------------------------: | :-------------: |
| 2347 (11%) | 932 (2%) | 9 x RAMB18 (9%), total 144Kbit | 35 MHz |
At 35 MHz the compression throughput is 35 Mpixel/s, which corresponds to a frame rate of 16.8 fps for 1920x1080 images.
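
The frame-rate figure follows from simple cycle counting. The Python sketch below assumes one pixel is accepted per clock with no bubbles, and adds the minimum 368 start cycles and 16 idle cycles per image from the input protocol; `frames_per_second` is a hypothetical helper name.

```python
def frames_per_second(clk_hz, width, height, sof_cycles=368, idle_cycles=16):
    """Estimate the frame rate when pixels are input without bubbles."""
    cycles_per_frame = sof_cycles + width * height + idle_cycles
    return clk_hz / cycles_per_frame


fps = frames_per_second(35_000_000, 1920, 1080)
print(f"{fps:.2f} fps")
```

The per-image overhead of 384 cycles is negligible next to the roughly 2.07 million pixel cycles of a 1920x1080 frame, so the result stays close to the pixel rate divided by the pixel count.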

903
RTL/jls_encoder.sv Normal file

@ -0,0 +1,903 @@
//--------------------------------------------------------------------------------------------------------
// Module : jls_encoder
// Type : synthesizable, IP's top
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Function: JPEG-LS image compressor
//--------------------------------------------------------------------------------------------------------
module jls_encoder #(
parameter [2:0] NEAR = 3'd1
) (
input wire rstn,
input wire clk,
input wire i_sof, // start of image
input wire [13:0] i_w, // image_width-1 , range: 4~16383, that is, image_width range: 5~16384
input wire [13:0] i_h, // image_height-1, range: 0~16382, that is, image_height range: 1~16383
input wire i_e, // input pixel enable
input wire [ 7:0] i_x, // input pixel
output wire o_e, // output data enable
output wire [15:0] o_data, // output data
output wire o_last // indicates the last output data of an image
);
//---------------------------------------------------------------------------------------------------------------------------
// local parameters
//---------------------------------------------------------------------------------------------------------------------------
wire [3:0] P_QBPPS [8];
assign P_QBPPS[0] = 4'd8;
assign P_QBPPS[1] = 4'd7;
assign P_QBPPS[2] = 4'd6;
assign P_QBPPS[3] = 4'd6;
assign P_QBPPS[4] = 4'd5;
assign P_QBPPS[5] = 4'd5;
assign P_QBPPS[6] = 4'd5;
assign P_QBPPS[7] = 4'd5;
localparam logic P_LOSSY = NEAR != '0;
localparam logic signed [8:0] P_NEAR = $signed({6'd0, NEAR});
localparam logic signed [8:0] P_T1 = $signed(9'd3) + $signed(9'd3) * P_NEAR;
localparam logic signed [8:0] P_T2 = $signed(9'd7) + $signed(9'd5) * P_NEAR;
localparam logic signed [8:0] P_T3 = $signed(9'd21)+ $signed(9'd7) * P_NEAR;
localparam logic signed [9:0] P_QUANT = {P_NEAR, 1'b1};
localparam logic signed [9:0] P_QBETA = $signed(10'd256 + {5'd0,NEAR,2'd0}) / P_QUANT;
localparam logic signed [9:0] P_QBETAHALF = (P_QBETA+$signed(10'd1)) / $signed(10'd2);
wire [3:0] P_QBPP = P_QBPPS[NEAR];
wire [4:0] P_LIMIT = 5'd31 - {1'b0, P_QBPP};
localparam logic [12:0] P_AINIT = (NEAR=='0) ? 13'd4 : 13'd2;
wire [3:0] J [32];
assign J[ 0] = 4'd0;
assign J[ 1] = 4'd0;
assign J[ 2] = 4'd0;
assign J[ 3] = 4'd0;
assign J[ 4] = 4'd1;
assign J[ 5] = 4'd1;
assign J[ 6] = 4'd1;
assign J[ 7] = 4'd1;
assign J[ 8] = 4'd2;
assign J[ 9] = 4'd2;
assign J[10] = 4'd2;
assign J[11] = 4'd2;
assign J[12] = 4'd3;
assign J[13] = 4'd3;
assign J[14] = 4'd3;
assign J[15] = 4'd3;
assign J[16] = 4'd4;
assign J[17] = 4'd4;
assign J[18] = 4'd5;
assign J[19] = 4'd5;
assign J[20] = 4'd6;
assign J[21] = 4'd6;
assign J[22] = 4'd7;
assign J[23] = 4'd7;
assign J[24] = 4'd8;
assign J[25] = 4'd9;
assign J[26] = 4'd10;
assign J[27] = 4'd11;
assign J[28] = 4'd12;
assign J[29] = 4'd13;
assign J[30] = 4'd14;
assign J[31] = 4'd15;
//---------------------------------------------------------------------------------------------------------------------------
// function: is_near
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic func_is_near(input [7:0] x1, input [7:0] x2);
logic signed [8:0] ex1, ex2;
ex1 = $signed({1'b0,x1});
ex2 = $signed({1'b0,x2});
return ex1 - ex2 <= P_NEAR && ex2 - ex1 <= P_NEAR;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: predictor (get_px)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [7:0] func_predictor(input [7:0] a, input [7:0] b, input [7:0] c);
if( c>=a && c>=b )
return a>b ? b : a;
else if( c<=a && c<=b )
return a>b ? a : b;
else
return a - c + b;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: q_quantize
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [3:0] func_q_quantize(input [7:0] x1, input [7:0] x2);
logic signed [8:0] delta;
delta = $signed({1'b0,x1}) - $signed({1'b0,x2});
if (delta <= -P_T3 )
return -$signed(4'd4);
else if(delta <= -P_T2 )
return -$signed(4'd3);
else if(delta <= -P_T1 )
return -$signed(4'd2);
else if(delta < -P_NEAR )
return -$signed(4'd1);
else if(delta <= P_NEAR )
return $signed(4'd0);
else if(delta < P_T1 )
return $signed(4'd1);
else if(delta < P_T2 )
return $signed(4'd2);
else if(delta < P_T3 )
return $signed(4'd3);
else
return $signed(4'd4);
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get_q (part 1), qp1 = 81*Q(d-b) + 9*Q(b-c)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_get_qp1(input [7:0] c, input [7:0] b, input [7:0] d);
return $signed(10'd81) * func_q_quantize(d,b) + $signed(10'd9) * func_q_quantize(b,c);
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get_q (part 2), get sign(qs) and abs(qs), where qs = qp1 + Q(c-a)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [9:0] func_get_q(input signed [9:0] qp1, input [7:0] c, input [7:0] a);
logic signed [9:0] qs;
logic s;
logic [8:0] q;
qs = qp1 + func_q_quantize(c,a);
s = qs[9];
q = s ? (~qs[8:0]+9'd1) : qs[8:0];
return {s, q};
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: clip
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [7:0] func_clip(input signed [9:0] val);
if( val > $signed(10'd255) )
return 8'd255;
else if( val < $signed(10'd0) )
return 8'd0;
else
return val[7:0];
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: errval_quantize
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_errval_quantize(input signed [9:0] err);
if(err[9])
return -( (P_NEAR - err) / P_QUANT );
else
return (P_NEAR + err) / P_QUANT;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: modrange
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_modrange(input signed [9:0] val);
logic signed [9:0] new_val;
new_val = val;
if( new_val[9] )
new_val += P_QBETA;
if( new_val >= P_QBETAHALF )
new_val -= P_QBETA;
return new_val;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get k
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [3:0] func_get_k(input [12:0] A, input [6:0] N, input rt);
logic [18:0] Nt, At;
logic [ 3:0] k;
Nt = {12'h0, N};
At = { 6'h0, A};
k = 4'd0;
if(rt)
At += {13'd0, N[6:1]};
for(int ii=0; ii<13; ii++)
if((Nt<<ii) < At)
k++;
return k;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: B update for run mode
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [6:0] B_update(input reset, input [6:0] B, input errm0);
B_update = B;
if(errm0)
B_update ++;
if(reset)
B_update >>>= 1;
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: C, B update for regular mode
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [14:0] C_B_update(input reset, input [6:0] N, input signed [7:0] C, input signed [6:0] B, input signed [9:0] err);
logic signed [9:0] Bt;
logic signed [7:0] Ct;
Bt = B;
Ct = C;
Bt += err * P_QUANT;
if(reset)
Bt >>>= 1;
if( Bt <= -$signed({3'd0,N}) ) begin
Bt += $signed({3'd0,N});
if( Bt <= -$signed({3'd0,N}) )
Bt = -$signed({3'd0,N}-10'd1);
if( Ct != $signed(8'd128) )
Ct--;
end else if( Bt > $signed(10'd0) ) begin
Bt -= $signed({3'd0,N});
if( Bt > $signed(10'd0) )
Bt = $signed(10'd0);
if( Ct != $signed(8'd127) )
Ct++;
end
return {Ct, Bt[6:0]};
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: A update
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [12:0] A_update(input reset, input [12:0] A, input [9:0] inc);
A_update = A + {3'd0, inc};
if(reset)
A_update >>>= 1;
endfunction
//-------------------------------------------------------------------------------------------------------------------
// context memories
//-------------------------------------------------------------------------------------------------------------------
reg [ 5:0] Nram [366];
reg [12:0] Aram [366];
reg signed [ 6:0] Bram [366];
reg signed [ 7:0] Cram [1:364];
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage a: generate ii, jj
//-------------------------------------------------------------------------------------------------------------------
reg a_sof;
reg a_e;
reg [ 7:0] a_x;
reg [13:0] a_w;
reg [13:0] a_h;
reg [13:0] a_wl;
reg [13:0] a_hl;
reg [13:0] a_ii;
reg [14:0] a_jj;
always @ (posedge clk)
if(~rstn) begin
{a_sof, a_e, a_x, a_w, a_h, a_wl, a_hl, a_ii, a_jj} <= '0;
end else begin
a_sof <= i_sof;
a_e <= i_e;
a_x <= i_x;
a_w <= i_w;
a_h <= i_h;
if(a_sof) begin
a_wl <= (a_w<14'd4 ? 14'd4 : a_w);
a_hl <= a_h;
a_ii <= '0;
a_jj <= '0;
end else if(a_e) begin
if(a_ii < a_wl)
a_ii <= a_ii + 14'd1;
else begin
a_ii <= '0;
if(a_jj <= {1'b0,a_hl})
a_jj <= a_jj + 15'd1;
end
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage b: generate fc, lc, fr, eof
//-------------------------------------------------------------------------------------------------------------------
reg b_sof;
reg b_e;
reg b_fc;
reg b_lc;
reg b_fr;
reg b_eof;
reg [13:0] b_ii;
reg [ 7:0] b_x;
always @ (posedge clk) begin
b_sof <= a_sof & rstn;
if(~rstn | a_sof) begin
{b_e, b_fc, b_lc, b_fr, b_eof, b_ii, b_x} <= '0;
end else begin
b_e <= a_e & (a_jj <= {1'b0,a_hl});
b_fc <= a_e & (a_ii == '0);
b_lc <= a_e & (a_ii == a_wl);
b_fr <= a_e & (a_jj == '0);
b_eof <= a_jj > {1'b0,a_hl};
b_ii <= a_ii;
b_x <= a_x;
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage c: maintain linebuffer, generate context pixels: b, c, d , where d is not valid in case of fr and lc
//-------------------------------------------------------------------------------------------------------------------
reg c_sof;
reg c_e;
reg c_fc;
reg c_lc;
reg c_fr;
reg c_eof;
reg [13:0] c_ii;
reg [ 7:0] c_x;
reg [ 7:0] c_b;
reg [ 7:0] c_bt;
reg [ 7:0] c_c;
reg [ 7:0] c_d;
always @ (posedge clk) begin
c_sof <= b_sof & rstn;
if(~rstn | b_sof) begin
{c_e,c_fc,c_lc,c_fr,c_eof,c_ii,c_x,c_b,c_bt,c_c} <= '0;
end else begin
c_e <= b_e;
c_fc <= b_fc;
c_lc <= b_lc;
c_fr <= b_fr;
c_eof <= b_eof;
c_ii <= b_ii;
if(b_e) begin
c_x <= b_x;
c_b <= b_fr ? '0 : c_d;
if(b_fr) begin
c_bt <= '0;
c_c <= '0;
end else if(b_fc) begin
c_bt <= c_d;
c_c <= c_bt;
end else
c_c <= c_b;
end
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage d: fix context pixel d (locally) in case fr and lc, get q part-1 (qp1)
//-------------------------------------------------------------------------------------------------------------------
reg d_sof;
reg d_e;
reg d_fc;
reg d_lc;
reg d_eof;
reg [13:0] d_ii;
reg [ 7:0] d_x;
reg [ 7:0] d_b;
reg [ 7:0] d_c;
reg signed [9:0] d_qp1;
always @ (posedge clk) begin
d_sof <= c_sof & rstn;
if(~rstn | c_sof) begin
{d_e, d_fc, d_lc, d_eof, d_ii, d_x, d_b, d_c, d_qp1} <= '0;
end else begin
logic [7:0] d;
d_e <= c_e;
d_fc <= c_fc;
d_lc <= c_lc;
d_eof <= c_eof;
d_ii <= c_ii;
d_x <= c_x;
d_b <= c_b;
d_c <= c_c;
d = c_fr ? '0 : (c_lc ? c_b : c_d);
d_qp1 <= func_get_qp1(c_c, c_b, d);
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage e: get errval, Rx reconstruct loop, N, B, C update
//-------------------------------------------------------------------------------------------------------------------
reg e_sof;
reg e_e;
reg e_fc;
reg e_lc;
reg e_eof;
reg [13:0] e_ii;
reg e_runi;
reg e_rune;
reg e_2BleN;
reg [7:0] e_x;
reg [8:0] e_q;
reg e_rt;
reg signed [9:0] e_err;
reg [6:0] e_No;
reg e_write_C, e_write_en;
reg [5:0] e_Nn;
reg signed [7:0] e_Cn;
reg signed [6:0] e_Bn;
always @ (posedge clk) begin
e_sof <= d_sof & rstn;
e_2BleN <= 1'b0;
{e_write_C, e_Cn, e_write_en, e_Bn, e_Nn} <= '0;
if(~rstn | d_sof) begin
{e_e, e_fc, e_lc, e_eof, e_ii, e_runi, e_rune, e_x, e_q, e_rt, e_err, e_No} <= '0;
end else begin
logic [7:0] a;
logic s;
logic [8:0] q;
logic rt;
logic runi;
logic rune;
logic signed [7:0] Co;
logic [6:0] No, Nn;
logic signed [6:0] Bo;
logic signed [9:0] px;
logic signed [9:0] err;
a = d_fc ? d_b : e_x;
rt = 1'b0;
rune = 1'b0;
No = '0;
err = '0;
{s, q} = func_get_q(d_qp1, d_c, a);
Co = (e_write_C & e_q==q) ? e_Cn : Cram[q];
runi = ~d_fc & e_runi | (q == 9'd0);
if(runi) begin
runi = func_is_near(d_x, a);
rune = ~runi;
end
if(d_e) begin
if(runi) begin
e_x <= P_LOSSY ? a : d_x;
end else begin
if(rune) begin
rt = func_is_near(d_b, a);
s = {1'b0,a} > ({1'b0,d_b} + {6'd0,NEAR}) ? 1'b1 : 1'b0;
q = rt ? 9'd365 : 9'd0;
px = rt ? a : d_b;
end else begin
px[9:8] = 2'b00;
px[7:0] = func_clip( $signed({2'h0, func_predictor(a,d_b,d_c)}) + ( s ? -$signed({Co[7],Co[7],Co}) : $signed({Co[7],Co[7],Co}) ) );
end
err = s ? px - $signed({2'd0, d_x}) : $signed({2'd0, d_x}) - px;
err = func_errval_quantize(err);
e_x <= P_LOSSY ? func_clip( px + ( s ? -(P_QUANT*err) : P_QUANT*err ) ) : d_x;
err = func_modrange(err);
No = ((e_write_en & e_q==q) ? e_Nn : Nram[q]) + 7'd1;
Nn = No;
if(No[6]) Nn >>>= 1;
e_Nn <= Nn[5:0];
Bo = (e_write_en & e_q==q) ? e_Bn : Bram[q];
e_write_en <= 1'b1;
if(rune) begin
e_Bn <= B_update(No[6], Bo, err<$signed(10'd0));
e_2BleN <= $signed({Bo,1'b0}) < $signed({1'b0,No});
end else begin
e_write_C <= 1'b1;
{e_Cn, e_Bn} <= C_B_update(No[6], Nn+7'd1, Co, Bo, err);
e_2BleN <= $signed({Bo,1'b0}) <= -$signed({1'b0,No});
end
end
e_runi <= runi;
end
e_e <= d_e;
e_fc <= d_fc;
e_lc <= d_lc;
e_eof <= d_eof;
e_ii <= d_ii;
e_rune <= d_e & rune;
e_q <= q;
e_rt <= rt;
e_err <= err;
e_No <= No;
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage f: write Cram, Bram, Nram
//-------------------------------------------------------------------------------------------------------------------
reg [8:0] NBC_init_addr;
always @ (posedge clk)
NBC_init_addr <= e_sof ? NBC_init_addr + (NBC_init_addr < 9'd366 ? 9'd1 : 9'd0) : 9'd0;
always @ (posedge clk)
if(e_sof | e_write_en) begin
Nram[e_write_en ? e_q : NBC_init_addr] <= e_Nn;
Bram[e_write_en ? e_q : NBC_init_addr] <= e_Bn;
end
always @ (posedge clk)
if(e_sof | e_write_C) begin
Cram[e_write_C ? e_q : NBC_init_addr] <= e_Cn;
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage ef: read Aram, buffer registers
//-------------------------------------------------------------------------------------------------------------------
reg ef_sof;
reg ef_e;
reg ef_fc;
reg ef_lc;
reg ef_eof;
reg ef_runi;
reg ef_rune;
reg ef_2BleN;
reg [8:0] ef_q;
reg ef_rt;
reg signed [9:0] ef_err;
reg [6:0] ef_No;
reg [12:0] ef_Ao;
reg ef_write_en;
always @ (posedge clk) begin
ef_sof <= e_sof & rstn;
if(~rstn | e_sof) begin
{ef_e, ef_fc, ef_lc, ef_eof, ef_runi, ef_rune, ef_2BleN, ef_q, ef_rt, ef_err, ef_No, ef_write_en} <= '0;
end else begin
ef_e <= e_e;
ef_fc <= e_fc;
ef_lc <= e_lc;
ef_eof <= e_eof;
ef_runi <= e_runi & e_e;
ef_rune <= e_rune;
ef_2BleN <= e_2BleN;
ef_q <= e_q;
ef_rt <= e_rt;
ef_err <= e_err;
ef_No <= e_No;
ef_write_en <= e_write_en;
end
end
always @ (posedge clk)
ef_Ao <= Aram[e_q];
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage f: process run, calculate merrval and k, A update
//-------------------------------------------------------------------------------------------------------------------
reg f_sof;
reg f_e;
reg f_eof;
reg f_runi;
reg f_rune;
reg [ 9:0] f_merr;
reg [ 3:0] f_k;
reg [15:0] f_rc;
reg [ 4:0] f_ri;
reg [ 1:0] f_on;
reg [15:0] f_cb;
reg [ 4:0] f_cn; // in range of 0~16
reg [ 4:0] f_limit;
reg [8:0] f_q;
reg f_write_en;
reg [12:0] f_An;
reg [8:0] g_q;
reg g_write_en;
reg [12:0] g_An;
always @ (posedge clk)
if(~rstn | f_sof)
{g_q, g_write_en, g_An} <= '0;
else
{g_q, g_write_en, g_An} <= {f_q, f_write_en, f_An};
always @ (posedge clk) begin
f_sof <= ef_sof & rstn;
f_limit <= P_LIMIT;
f_An <= P_AINIT;
if(~rstn | ef_sof) begin
{f_e, f_eof, f_runi, f_rune, f_merr, f_k, f_rc, f_ri, f_on, f_cb, f_cn, f_q, f_write_en} <= '0;
end else begin
logic [ 1:0] on;
logic [15:0] rc;
logic [ 4:0] ri;
logic [12:0] Ao;
logic [ 3:0] k;
logic [ 9:0] abserr;
logic [ 9:0] merr, Ainc;
logic map;
on = '0;
rc = (ef_fc|~ef_runi) ? '0 : f_rc;
ri = f_ri;
Ao = (f_write_en & f_q==ef_q) ? f_An : (g_write_en & g_q==ef_q) ? g_An : ef_Ao;
abserr = ef_err<$signed(10'd0) ? $unsigned(-ef_err) : $unsigned(ef_err);
merr='0;
Ainc='0;
f_write_en <= ef_write_en;
k = func_get_k(Ao, ef_No, ef_rt);
f_cb <= ef_fc ? '0 : f_rc;
f_cn <= {1'b0,J[ri]} + 5'd1;
if(ef_runi) begin
rc ++;
if(rc >= (16'd1<<J[ri])) begin
on++;
rc -= (16'd1<<J[ri]);
if(ri < 5'd31) ri ++;
end
if(ef_lc & (rc > 16'd0))
on++;
end else if(ef_rune) begin
f_limit <= P_LIMIT - 5'd1 - {1'b0,J[ri]};
if(ri > '0) ri --;
map = ~( (ef_err=='0) | ( (ef_err>$signed(10'd0)) ^ (k==4'd0 & ef_2BleN) ) );
merr = (abserr<<1) - {9'd0,ef_rt} - {9'd0,map};
Ainc = ((merr + {9'd0,~ef_rt}) >> 1);
end else begin
map = (~P_LOSSY) & (k==4'd0) & ef_2BleN;
if(ef_err < $signed(10'd0))
merr = (abserr<<1) - 10'd1 - {9'd0,map};
else
merr = (abserr<<1) + {9'd0,map};
Ainc = (ef_err < $signed(10'd0)) ? $unsigned(-ef_err) : $unsigned(ef_err);
end
if(ef_e) begin
f_rc <= rc;
f_ri <= ri;
end
f_An <= A_update(ef_No[6], Ao, Ainc);
f_e <= ef_e;
f_eof <= ef_eof;
f_runi <= ef_runi;
f_rune <= ef_rune;
f_merr <= merr;
f_k <= k;
f_on <= on;
f_q <= ef_q;
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage g: write Aram
//-------------------------------------------------------------------------------------------------------------------
reg [8:0] A_init_addr;
always @ (posedge clk)
A_init_addr <= f_sof ? A_init_addr + (A_init_addr < 9'd366 ? 9'd1 : 9'd0) : 9'd0;
always @ (posedge clk)
if(f_sof | f_write_en)
Aram[f_write_en ? f_q : A_init_addr] <= f_An;
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage g: golomb coding parser
//-------------------------------------------------------------------------------------------------------------------
reg g_sof;
reg g_e;
reg g_eof;
reg g_runi;
reg [ 1:0] g_on; // in range of 0~2
reg [15:0] g_cb;
reg [ 4:0] g_cn; // in range of 0~16
reg [ 4:0] g_zn; // in range of 0~27
reg [ 9:0] g_db;
reg [ 3:0] g_dn; // in range of 0~13
always @ (posedge clk) begin
g_sof <= f_sof & rstn;
if(~rstn | f_sof) begin
{g_e, g_eof, g_runi, g_on, g_cb, g_cn, g_zn, g_db, g_dn} <= '0;
end else begin
logic [9:0] merr_sk;
merr_sk = f_merr >> f_k;
g_e <= f_e;
g_eof <= f_eof;
g_runi <= f_runi;
g_on <= f_on;
g_cb <= f_rune ? f_cb : '0;
g_cn <= f_rune ? f_cn : '0;
if(merr_sk < f_limit) begin
g_zn <= merr_sk[4:0];
g_db <= f_merr & ~(10'h3ff<<f_k);
g_dn <= f_k;
end else begin
g_zn <= f_limit[4:0];
g_db <= (f_merr-10'd1) & ~(10'h3ff<<P_QBPP);
g_dn <= P_QBPP;
end
end
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage h: golomb coding bits merge
//-------------------------------------------------------------------------------------------------------------------
reg h_sof;
reg h_eof;
reg [56:0] h_bb; // max 57 bits
reg [ 5:0] h_bn; // in range of 0~57
always @ (posedge clk) begin
h_sof <= g_sof & rstn;
{h_bb, h_bn} <= '0;
if(~rstn | g_sof) begin
h_eof <= 1'b0;
end else begin
h_eof <= g_eof;
if(g_e) begin
if(g_runi) begin
if(g_on==2'd1)
h_bb[56] <= 1'b1;
else if(g_on==2'd2)
h_bb[56:55] <= 2'b11;
h_bn <= {4'h0, g_on};
end else begin
h_bb <= ( {41'h0,g_cb} << (6'd57-g_cn) ) | ( 57'd1 << (6'd56-g_cn-g_zn) ) | ( {47'h0,g_db} << (6'd56-g_cn-g_zn-g_dn) );
h_bn <= {1'b0,g_cn} + {1'b0,g_zn} + 6'd1 + {2'b0,g_dn};
end
end
end
end
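// illustrative worked example of stages g and h for a regular (non-run) sample.
// added for explanation only: the input values are made up, and f_limit is assumed to be larger than 3.
//   inputs  : f_merr = 10'd13, f_k = 4'd2, f_runi = 0, f_rune = 0
//   stage g : merr_sk = 13 >> 2 = 3,  g_zn = 3,  g_db = 13 & 10'b11 = 1,  g_dn = 2,  g_cb = 0,  g_cn = 0
//   stage h : the 1 bit lands at h_bb[56-0-3] = h_bb[53], and g_db is shifted to h_bb[52:51]
//             so h_bb[56:51] = 6'b000_1_01 (3 zeros, a 1, then g_db in 2 bits) and h_bn = 0+3+1+2 = 6
//   if instead merr_sk >= f_limit, stage g selects the escape form:
//             f_limit zeros, a 1, then the low P_QBPP bits of (f_merr-1)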
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage j: jls stream generate
//-------------------------------------------------------------------------------------------------------------------
reg j_sof;
reg j_eof;
reg j_e;
reg [15:0] j_data;
reg[247:0] j_bbuf;
reg [ 7:0] j_bcnt;
always @ (posedge clk) begin
j_sof <= h_sof & rstn;
{j_e, j_data} <= '0;
if(~rstn | h_sof) begin
{j_eof, j_bbuf, j_bcnt} <= '0;
end else begin
logic [247:0] bbuf;
logic [ 7:0] bcnt;
bbuf = j_bbuf | ({h_bb,191'h0} >> j_bcnt);
bcnt = j_bcnt + {2'd0,h_bn};
if(bcnt >= 8'd16) begin
j_e <= 1'b1;
j_data[15:8] <= bbuf[247:240];
if(bbuf[247:240] == '1) begin
bbuf = {1'h0, bbuf[239:0], 7'h0};
bcnt -= 8'd7;
end else begin
bbuf = { bbuf[239:0], 8'h0};
bcnt -= 8'd8;
end
j_data[ 7:0] <= bbuf[247:240];
if(bbuf[247:240] == '1) begin
bbuf = {1'h0, bbuf[239:0], 7'h0};
bcnt -= 8'd7;
end else begin
bbuf = { bbuf[239:0], 8'h0};
bcnt -= 8'd8;
end
end else if(h_eof && bcnt > 8'd0) begin
j_e <= 1'b1;
j_data[15:8] <= bbuf[247:240];
if(bbuf[247:240] == '1)
j_data[ 7:0] <= {1'b0,bbuf[239:233]};
else
j_data[ 7:0] <= bbuf[239:232];
bbuf = '0;
bcnt = 8'd0;
end
j_bbuf <= bbuf;
j_bcnt <= bcnt;
j_eof <= h_eof;
end
end
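// illustrative worked example of the bit stuffing done in stage j (added for explanation only, the bit pattern is made up).
// whenever an emitted byte equals 8'hFF, the next byte gets a stuffed 0 as its MSB and carries only 7 payload bits.
//   suppose the oldest payload bits in bbuf are 11111111 1100000 ... and bcnt >= 16 :
//     j_data[15:8] = 8'hFF                       : 8 payload bits leave, 1 stuffed bit enters, bcnt -= 7
//     j_data[ 7:0] = {1'b0, 7'b1100000} = 8'h60  : 8 buffered bits leave (7 of them payload),  bcnt -= 8
//   so the output word 16'hFF60 consumes 15 payload bits instead of 16.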
//-------------------------------------------------------------------------------------------------------------------
// make .jls file header and footer
//-------------------------------------------------------------------------------------------------------------------
reg [15:0] jls_wl, jls_hl;
wire[15:0] jls_header [13];
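assign jls_header[0] = 16'hFFD8;      // SOI marker
assign jls_header[1] = 16'h00FF;      // pad byte 8'h00, then high byte of the SOF55 marker (FFF7)
assign jls_header[2] = 16'hF700;      // low byte of the SOF55 marker, then high byte of the segment length
assign jls_header[3] = 16'h0B08;      // low byte of the segment length (11), then sample precision (8 bits)
assign jls_header[4] = jls_hl;        // image height (number of lines)
assign jls_header[5] = jls_wl;        // image width (samples per line)
assign jls_header[6] = 16'h0101;      // number of components (1), component ID (1)
assign jls_header[7] = 16'h1100;      // sampling factors (1x1), Tq (0)
assign jls_header[8] = 16'hFFDA;      // SOS marker
assign jls_header[9] = 16'h0008;      // segment length (8)
assign jls_header[10]= 16'h0101;      // number of components in scan (1), component ID (1)
assign jls_header[11]= {13'b0,NEAR};  // mapping table selector (0), NEAR
assign jls_header[12]= 16'h0000;      // interleave mode (0), point transform (0)
wire[15:0] jls_footer = 16'hFFD9;     // EOI marker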
always @ (posedge clk)
if(~rstn) begin
jls_wl <= '0;
jls_hl <= '0;
end else begin
jls_wl <= {2'd0,a_wl} + 16'd1;
jls_hl <= {2'd0,a_hl} + 16'd1;
end
//-------------------------------------------------------------------------------------------------------------------
// pipeline stage k: add .jls file header and footer
//-------------------------------------------------------------------------------------------------------------------
reg [3:0] k_header_i;
reg k_footer_i;
reg k_last;
reg k_e;
reg [15:0] k_data;
always @ (posedge clk) begin
k_last <= 1'b0;
k_e <= 1'b0;
k_data <= '0;
if(j_sof) begin
k_footer_i <= '0;
if(k_header_i < 4'd13) begin
k_e <= 1'b1;
k_data <= jls_header[k_header_i];
k_header_i <= k_header_i + 4'd1;
end
end else if(j_e) begin
k_header_i <= '0;
k_footer_i <= '0;
k_e <= 1'b1;
k_data <= j_data;
end else if(j_eof) begin
k_header_i <= '0;
k_footer_i <= 1'b1;
if(~k_footer_i) begin
k_last <= 1'b1;
k_e <= 1'b1;
k_data <= jls_footer;
end
end else begin
k_header_i <= '0;
k_footer_i <= '0;
end
end
//-------------------------------------------------------------------------------------------------------------------
// linebuffer for context pixels
//-------------------------------------------------------------------------------------------------------------------
reg [7:0] linebuffer [1<<14];
always @ (posedge clk) // line buffer read
c_d <= linebuffer[a_ii];
always @ (posedge clk) // line buffer write
if(e_e) linebuffer[e_ii] <= e_x;
//-------------------------------------------------------------------------------------------------------------------
// output signal
//-------------------------------------------------------------------------------------------------------------------
assign o_last = k_last;
assign o_e = k_e;
assign o_data = k_data;
endmodule

BIN
SIM/decoder.exe Normal file

Binary file not shown.

5
SIM/images/test000.pgm Normal file

@ -0,0 +1,5 @@
P5
5
1
255
<EFBFBD>e<>J

BIN
SIM/images/test001.pgm Normal file

Binary file not shown.

14
SIM/images/test002.pgm Normal file

File diff suppressed because one or more lines are too long

5
SIM/images/test003.pgm Normal file

File diff suppressed because one or more lines are too long

4
SIM/images/test004.pgm Normal file

File diff suppressed because one or more lines are too long

79
SIM/images/test005.pgm Normal file

File diff suppressed because one or more lines are too long

417
SIM/images/test006.pgm Normal file

File diff suppressed because one or more lines are too long

502
SIM/images/test007.pgm Normal file

File diff suppressed because one or more lines are too long

534
SIM/images/test008.pgm Normal file

File diff suppressed because one or more lines are too long

228
SIM/tb_jls_encoder.sv Normal file

@ -0,0 +1,228 @@
//--------------------------------------------------------------------------------------------------------
// Module : tb_jls_encoder
// Type : simulation, top
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Function: testbench for jls_encoder,
// load .pgm files (uncompressed images) and feed them to jls_encoder,
// then write the JPEG-LS streams output by jls_encoder to .jls files (JPEG-LS images)
//--------------------------------------------------------------------------------------------------------
`timescale 1ps/1ps
`define NEAR 1 // NEAR can be 0~7
`define FILE_NO_FIRST 1 // first input file name is test001.pgm
`define FILE_NO_FINAL 8 // final input file name is test008.pgm
// number of bubbles (idle cycles) inserted between pixels
// when = 0, do not insert bubbles
// when > 0, insert BUBBLE_CONTROL bubbles
// when < 0, insert a random number (0 to -BUBBLE_CONTROL) of bubbles
`define BUBBLE_CONTROL -2
// the input and output file names' format
`define FILE_NAME_FORMAT "test%03d"
// input file (uncompressed .pgm file) directory
`define INPUT_PGM_DIR "./images"
// output file (compressed .jls file) directory
`define OUTPUT_JLS_DIR "./"
module tb_jls_encoder ();
initial $dumpvars(1, tb_jls_encoder);
// -------------------------------------------------------------------------------------------------------------------
// generate clock and reset
// -------------------------------------------------------------------------------------------------------------------
reg rstn = 1'b0;
reg clk = 1'b0;
always #50000 clk = ~clk; // 10MHz
initial begin repeat(4) @(posedge clk); rstn<=1'b1; end
// -------------------------------------------------------------------------------------------------------------------
// signals for jls_encoder_i module
// -------------------------------------------------------------------------------------------------------------------
reg i_sof = '0;
reg [13:0] i_w = '0;
reg [13:0] i_h = '0;
reg i_e = '0;
reg [ 7:0] i_x = '0;
wire o_e;
wire[15:0] o_data;
wire o_last;
logic [7:0] img [4096*4096];
int w = 0, h = 0;
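// -------------------------------------------------------------------------------------------------------------------
// task: load a binary .pgm (P5) file into img, and set w (image width) and h (image height)
// arguments:
// fname : path of the .pgm file
// -------------------------------------------------------------------------------------------------------------------
task automatic load_img(input logic [256*8:1] fname);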
int linelen, depth=0, scanf_num;
logic [256*8-1:0] line;
int fp = $fopen(fname, "rb");
if(fp==0) begin
$display("*** error: could not open file %s", fname);
$finish;
end
linelen = $fgets(line, fp);
if(line[8*(linelen-2)+:16] != 16'h5035) begin
$display("*** error: the first line must be P5");
$fclose(fp);
$finish;
end
scanf_num = $fgets(line, fp);
scanf_num = $sscanf(line, "%d%d", w, h);
if(scanf_num == 1) begin
scanf_num = $fgets(line, fp);
scanf_num = $sscanf(line, "%d", h);
end
scanf_num = $fgets(line, fp);
scanf_num = $sscanf(line, "%d", depth);
if(depth!=255) begin
$display("*** error: image depth must be 255");
$fclose(fp);
$finish;
end
for(int i=0; i<h*w; i++)
img[i] = $fgetc(fp);
$fclose(fp);
endtask
// -------------------------------------------------------------------------------------------------------------------
// task: feed image pixels to jls_encoder_i module
// the image is taken from the module-level variables img, w (image width) and h (image height)
// arguments:
// bubble_control : number of bubbles (idle cycles) inserted between pixels
// when = 0, do not insert bubbles
// when > 0, insert bubble_control bubbles
// when < 0, insert a random number (0 to -bubble_control) of bubbles
// -------------------------------------------------------------------------------------------------------------------
task automatic feed_img(input int bubble_control);
int num_bubble;
// start feeding an image by asserting i_sof for 368 cycles
repeat(368) begin
@(posedge clk)
i_sof <= 1'b1;
i_w <= w - 1;
i_h <= h - 1;
{i_e, i_x} <= '0;
end
// for all pixels of the image
for(int i=0; i<h*w; i++) begin
// calculate how many bubbles to insert
if(bubble_control<0) begin
num_bubble = $random % (1-bubble_control);
if(num_bubble<0)
num_bubble = -num_bubble;
end else begin
num_bubble = bubble_control;
end
// insert bubbles
repeat(num_bubble) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= '0;
// assert i_e to input a pixel
@(posedge clk)
{i_sof, i_w, i_h} <= '0;
i_e <= 1'b1;
i_x <= img[i];
end
// 16 cycles idle between images
repeat(16) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= '0;
endtask
// -------------------------------------------------------------------------------------------------------------------
// jls_encoder_i module
// -------------------------------------------------------------------------------------------------------------------
jls_encoder #(
.NEAR ( `NEAR )
) jls_encoder_i (
.rstn ( rstn ),
.clk ( clk ),
.i_sof ( i_sof ),
.i_w ( i_w ),
.i_h ( i_h ),
.i_e ( i_e ),
.i_x ( i_x ),
.o_e ( o_e ),
.o_data ( o_data ),
.o_last ( o_last )
);
// -------------------------------------------------------------------------------------------------------------------
// read images, feed them to jls_encoder_i module
// -------------------------------------------------------------------------------------------------------------------
int file_no; // file number
initial begin
logic [256*8:1] input_file_name;
logic [256*8:1] input_file_format;
$sformat(input_file_format , "%s\\%s.pgm", `INPUT_PGM_DIR, `FILE_NAME_FORMAT);
while(~rstn) @ (posedge clk);
for(file_no=`FILE_NO_FIRST; file_no<=`FILE_NO_FINAL; file_no=file_no+1) begin
$sformat(input_file_name, input_file_format , file_no);
load_img(input_file_name);
$display("%s (%5dx%5d)", input_file_name, w, h);
if( w < 5 || w > 16384 || h < 1 || h > 16384 ) // image size not supported
$display(" *** image size not supported ***");
else
feed_img(`BUBBLE_CONTROL);
end
repeat(100) @(posedge clk);
$finish;
end
// -------------------------------------------------------------------------------------------------------------------
// write output stream to .jls files
// -------------------------------------------------------------------------------------------------------------------
logic [256*8:1] output_file_format;
initial $sformat(output_file_format, "%s\\%s.jls", `OUTPUT_JLS_DIR, `FILE_NAME_FORMAT);
logic [256*8:1] output_file_name;
int opened = 0;
int jls_file = 0;
always @ (posedge clk)
if(o_e) begin
// the first data of an output stream, open a new file.
if(opened == 0) begin
opened = 1;
$sformat(output_file_name, output_file_format, file_no);
jls_file = $fopen(output_file_name , "wb");
end
// write data to file.
if(opened != 0 && jls_file != 0)
$fwrite(jls_file, "%c%c", o_data[15:8], o_data[7:0]);
// if it is the last data of an output stream, close the file.
if(o_last) begin
opened = 0;
$fclose(jls_file);
end
end
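// -------------------------------------------------------------------------------------------------------------------
// optional stream sanity check (illustrative sketch, not part of the original testbench)
// it relies only on what jls_encoder.sv shows: a stream starts with jls_header[0] = 16'hFFD8 and ends with
// jls_footer = 16'hFFD9 together with o_last=1. the names in_stream and stream_bytes are hypothetical.
// -------------------------------------------------------------------------------------------------------------------
int in_stream = 0;
int stream_bytes = 0;
always @ (posedge clk)
    if(o_e) begin
        // the first word of every stream should be the SOI marker
        if(in_stream == 0 && o_data != 16'hFFD8)
            $display("*** warning: stream does not start with FFD8 (got %04x)", o_data);
        stream_bytes = stream_bytes + 2;          // every o_data word carries 2 bytes
        if(o_last) begin
            // the last word of every stream should be the EOI marker
            if(o_data != 16'hFFD9)
                $display("*** warning: stream does not end with FFD9 (got %04x)", o_data);
            $display("    output stream: %0d bytes", stream_bytes);
            in_stream = 0;
            stream_bytes = 0;
        end else begin
            in_stream = 1;
        end
    end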
endmodule


@ -0,0 +1,5 @@
del sim.out dump.vcd
iverilog -g2005-sv -o sim.out tb_jls_encoder.sv ../RTL/jls_encoder.sv
vvp -n sim.out
del sim.out
pause