change to Verilog2001

WangXuan95 2023-06-03 21:15:55 +08:00
parent a03807ee47
commit 47df5b6219
5 changed files with 459 additions and 365 deletions

1
.gitignore vendored

@ -1,2 +1,3 @@
**/vivado
**/quartus
FPGA_jls_encoder_test

341
README.md

@ -1,173 +1,33 @@
![语言](https://img.shields.io/badge/语言-systemverilog_(IEEE1800_2005)-CAD09D.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg)
Chinese | [English](#en)
FPGA JPEG-LS image compressor
===========================
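A streaming **JPEG-LS** image compressor based on **FPGA**. Features:
* Compresses **8bit** grayscale images.
* Optional **lossless mode**, i.e. NEAR=0 .
* Optional **lossy mode**, adjustable with NEAR=1~7 .
* The image width ranges over [5,16384] and the image height over [1,16384].
* Minimalist streaming input and output.
# Background
**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm. Its lossless compression ratio is excellent, better than Lossless-JPEG, Lossless-JPEG2000, Lossless-JPEG-XR, FELICS, etc. **JPEG-LS** uses the maximum difference between a pixel before and after compression (the **NEAR** value) to control distortion: **NEAR=0** in lossless mode and **NEAR>0** in lossy mode. The larger **NEAR** is, the greater the distortion and the higher the compression ratio. The file suffix of a **JPEG-LS** compressed image is .**jls** .
# Module Usage
[**jls_encoder.sv**](./RTL/jls_encoder.sv) in the RTL directory is the JPEG-LS compression module that users can instantiate. It takes raw image pixels as input and outputs a JPEG-LS compressed stream.
## Module parameters
**jls_encoder** has a single parameter: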
```verilog
parameter logic [2:0] NEAR
```
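It sets the **NEAR** value: 3'd0 selects lossless mode, and 3'd1~3'd7 select lossy mode.
## Module signals
The input and output signals of **jls_encoder** are described in the following table.
| Signal | Full name | Direction | Width | Description |
| :---: | :---: | :---: | :---: | :--- |
| rstn | synchronous reset | input | 1bit | The module is reset if rstn=0 at a rising clock edge. Keep rstn=1 in normal operation. |
| clk | clock | input | 1bit | Clock. All signals should be aligned to the rising edge of clk. |
| i_sof | start of image | input | 1bit | To input a new image, hold i_sof=1 for at least 368 clock cycles. |
| i_w | image width - 1 | input | 14bit | For example, if the image width is 1920, set i_w to 14'd1919. Must remain valid while i_sof=1. |
| i_h | image height - 1 | input | 14bit | For example, if the image height is 1080, set i_h to 14'd1079. Must remain valid while i_sof=1. |
| i_e | input pixel valid | input | 1bit | When i_e=1, a pixel must be presented on i_x. |
| i_x | input pixel | input | 8bit | Pixel values range from 8'd0 to 8'd255 . |
| o_e | output valid | output | 1bit | When o_e=1, output stream data is present on o_data. |
| o_data | output stream data | output | 16bit | Big-endian: o_data[15:8] comes first, o_data[7:0] second. |
| o_last | end of output stream | output | 1bit | If o_last=1 while o_e=1, this is the last data of an image's output stream. |
> Note: i_w must not be less than 14'd4 .
## Input image
The operating procedure of the **jls_encoder module** is:
1. **Reset** (optional): hold rstn=0 for at least **1 cycle** to reset the module, then keep rstn=1 during normal operation. Resetting is in fact optional, i.e. rstn may simply be tied to 1.
2. **Start**: hold i_sof=1 for **at least 368 cycles** while driving the image width and height (each minus 1, see the table above) on i_w and i_h. i_w and i_h must remain valid for as long as i_sof=1.
3. **Input**: drive i_e and i_x to input all pixels of the image, from left to right and from top to bottom. Whenever i_e=1, i_x is taken as one pixel.
4. **Inter-image idle**: after all pixels have been input, stay idle for **at least 16 cycles** (i.e. i_sof=0, i_e=0) before returning to step 2 to start the next image.
Between i_sof=1 and i_e=1, and between successive i_e=1 cycles, any number of idle bubbles (i.e. i_sof=0, i_e=0) may be inserted. This means pixels can be fed intermittently (although the highest performance is reached only when no bubbles are inserted).
The figure below shows the input timing for compressing 2 images (// denotes omitted cycles, X denotes don't care). Image 1 has 1 bubble inserted after its first pixel, while image 2 has 1 bubble inserted after i_sof=1. Note that the **inter-image idle** period must last at least **16 cycles**.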
__ __// __ __ __ __ //_ __ // __ __// __ __ __ // __
clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_
_______//________ // // _______//________ //
i_sof ____/ // \________________//___________//____/ // \___________//________
_______//________ // // _______//________ //
i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
_______//________ // // _______//________ //
i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
// _____ ____//_____ // // _____//____
i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___
// _____ ____//_____ // // _____//____
i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX
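Stage: | start image 1 | input image 1 | inter-image idle | start image 2 | input image 2
## Output compressed stream
During the input, **jls_encoder** simultaneously outputs the compressed **JPEG-LS stream**, which constitutes the complete content of a .jls file (including the file header and trailer). When o_e=1, o_data carries one valid output word. o_data is big-endian: o_data[15:8] comes earlier in the stream and o_data[7:0] later. On the last word of each image's output stream, o_last=1 indicates the end of that image's compressed stream.
# RTL Simulation
All simulation-related files are in the SIM directory, including:
* tb_jls_encoder.sv is the testbench for jls_encoder. It feeds the uncompressed .pgm images in a specified folder into jls_encoder one after another, then saves the output of jls_encoder to .jls files.
* tb_jls_encoder_run_iverilog.bat contains the commands for running the iverilog simulation.
* The images folder contains several image files in .pgm format. The .pgm format stores uncompressed (i.e. raw pixel) 8bit grayscale images. It can be opened with Photoshop or a Linux image viewer (the Windows image viewer cannot open it).
> The .pgm file format is very simple: a file header gives the width and height of the image, followed directly by all of its raw pixels. I therefore chose .pgm files as the simulation input, since only a little testbench code is needed to parse a .pgm file and send its pixels to jls_encoder. You do not need to care about the .pgm format, though: jls_encoder has nothing to do with it and only needs the raw pixels of an image as input. Just look at the simulation waveforms to see how the pixels are fed into jls_encoder.
Before simulating with iverilog, you need to install iverilog, see: [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md)
Then double-click tb_jls_encoder_run_iverilog.bat to run the simulation, which takes somewhat more than ten minutes.
After the simulation finishes, you will find several .jls files in the folder; these are the compressed images. The simulation also produces a waveform file, dump.vcd, which you can open with gtkwave.
In addition, you can also modify some simulation parameters:
- Modify the macro **NEAR** in tb_jls_encoder.sv to change the compression ratio.
- Modify the macro **BUBBLE_CONTROL** in tb_jls_encoder.sv to set how many bubbles are inserted between adjacent input pixels:
- When **BUBBLE_CONTROL=0**, no bubbles are inserted.
- When **BUBBLE_CONTROL>0**, **BUBBLE_CONTROL** bubbles are inserted.
- When **BUBBLE_CONTROL<0**, a random number of bubbles in **0~(-BUBBLE_CONTROL)** is inserted each time.
> Under different NEAR and BUBBLE_CONTROL values, this repository has been verified by comparing its results on several hundred photos, which gives high confidence that it is free of bugs. (The automated verification code is not included here.)
## View compressed JLS file
Because **JPEG-LS** is rather niche and specialized, most image viewers cannot open .jls files.
You can try [this website](https://filext.com/file-extension/JLS) to view .jls files (although the site is often unavailable).
If the site is unavailable, you can use the provided decompressor decoder.exe to convert a .jls file back to a .pgm file and view that instead. Run the following command with CMD in the SIM directory: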
```powershell
.\decoder.exe <JLS_FILE_NAME> <PGM_FILE_NAME>
```
For example:
```powershell
.\decoder.exe test000.jls tmp.pgm
```
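> Note: decoder.exe is compiled from the C language source code provided by UBC : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
# FPGA Deployment
On Xilinx Artix-7 xc7a35tcsg324-2, the synthesis and implementation results are as follows.
| LUT | FF | BRAM | Max. clock frequency |
| :--------: | :------: | :----------------------------: | :----------: |
| 2347 (11%) | 932 (2%) | 9 x RAMB18 (9%), equivalent to 144Kbit | 35 MHz |
At 35MHz, the image compression performance is 35 Mpixel/s, which means the compression frame rate for 1920x1080 images is 16.8fps.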
![语言](https://img.shields.io/badge/语言-verilog_(IEEE1364_2001)-9A90FD.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg)
[English](#en) | [中文](#cn)
 
<span id="en">FPGA JPEG-LS image compressor</span>
===========================
**FPGA** based streaming **JPEG-LS** image compressor, features:
* Pure Verilog design, compatible with various FPGA platforms.
* For compressing **8bit** grayscale images.
* Support **lossless mode**, i.e. NEAR=0 .
* Support **lossy mode**, NEAR=1~7 adjustable.
* The value range of image width is [5,16384], and the value range of height is [1,16384].
* Minimalist streaming input and output.
* Simple streaming input and output.
 
# Background
**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm which has the best lossless compression ratio compared to JPEG2000 and JPEG-XR. **JPEG-LS** uses the maximum difference between the pixels before and after compression (**NEAR** value) to control distortion, **NEAR=0** is the lossless mode; **NEAR>0** is the lossy mode, the larger the **NEAR**, the greater the distortion and the greater the compression ratio. The file suffix name for **JPEG-LS** compressed image is .**jls** .
**JPEG-LS** (**JLS**) is a lossless/lossy image compression algorithm which has the best lossless compression ratio compared to PNG, Lossless-JPEG2000, Lossless-WEBP, Lossless-HEIF, etc. **JPEG-LS** uses the maximum difference between the pixels before and after compression (**NEAR** value) to control distortion, **NEAR=0** is the lossless mode; **NEAR>0** is the lossy mode, the larger the **NEAR**, the greater the distortion and the greater the compression ratio. The file suffix name for **JPEG-LS** compressed image is .**jls** .
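
The role of **NEAR** described above can be restated as a per-pixel error bound. This is only a sketch in notation introduced here for illustration ($x_i$ for an original pixel and $\hat{x}_i$ for the same pixel after decompression; neither symbol appears in the original text):

$$
\max_i \left| x_i - \hat{x}_i \right| \le \mathrm{NEAR}
$$

With NEAR=0 the bound forces $\hat{x}_i = x_i$, which is the lossless mode. For NEAR>0, the RTL in this commit quantizes prediction errors with a step of $2\,\mathrm{NEAR}+1$ (the local parameter `P_QUANT`), which is what allows a larger NEAR to trade distortion for compression ratio.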
JPEG-LS has two generations:
- JPEG-LS baseline (ITU-T T.87): JPEG-LS refers to the JPEG-LS baseline by default. **This repo implements the encoder of JPEG-LS baseline**. If you are interested in the software version of JPEG-LS baseline encoder, see https://github.com/WangXuan95/JPEG-LS (C language)
- JPEG-LS extension (ITU-T T.870): Its compression ratio is higher than JPEG-LS baseline, but it is very rarely used (no code for it can even be found online). **This repo is not about JPEG-LS extension**. However, I have a C implementation of JPEG-LS extension, see https://github.com/WangXuan95/JPEG-LS_extension
 
# Module Usage
@ -234,7 +94,7 @@ The following figure shows the input timing diagram of compressing 2 images (//r
During the input, **jls_encoder** also outputs the compressed **JPEG-LS stream**, which constitutes the complete content of a .jls file (including the file header and trailer). When `o_e=1`, `o_data` carries one valid output word. `o_data` is big-endian: `o_data[15:8]` comes earlier in the stream and `o_data[7:0]` later. On the last word of each image's output stream, `o_last=1` indicates the end of that image's compressed stream.
 
# RTL Simulation
@ -260,7 +120,7 @@ In addition, you can also modify some simulation parameters:
- When **BUBBLE_CONTROL>0**, insert **BUBBLE_CONTROL** bubbles.
- When **BUBBLE_CONTROL<0**, insert random **0~(-BUBBLE_CONTROL)** bubbles each time.
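
The three BUBBLE_CONTROL cases above could be modeled as follows. This is a minimal simulation-only sketch, not the code of tb_jls_encoder.sv: the module name `bubble_sketch` and the function `bubble_count` are hypothetical, and the control value is passed as an argument instead of being read from the testbench macro.

```verilog
// Simulation-only sketch. bubble_sketch and bubble_count are hypothetical
// names; the real testbench may implement the same rule differently.
module bubble_sketch;

// Returns how many idle cycles to insert before the next pixel.
function integer bubble_count;
    input integer ctrl;                          // plays the role of BUBBLE_CONTROL
begin
    if (ctrl >= 0)
        bubble_count = ctrl;                     // 0: no bubbles, >0: fixed count
    else
        bubble_count = {$random} % (1 - ctrl);   // pseudo-random in 0..(-ctrl)
end
endfunction

integer i;
initial
    for (i = 0; i < 5; i = i + 1)
        $display("bubbles = %0d", bubble_count(-3));   // each value lies in 0..3

endmodule
```

The concatenation `{$random}` makes the random value unsigned, so the modulo result is never negative.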
 
## View compressed JLS file
@ -282,7 +142,7 @@ For example:
> Note: decoder.exe is compiled from the C language source code provided by UBC : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
 
# FPGA Deployment
@ -294,3 +154,174 @@ On Xilinx Artix-7 xc7a35tcsg324-2, the synthesized and implemented results are a
At 35MHz, the image compression performance is 35 Mpixel/s, which means the compression frame rate for 1920x1080 images is 16.8fps.
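
The frame rate quoted above follows from one pixel being accepted per clock cycle. As a rough check (a sketch that ignores bubbles, with $f_{clk}$, $W$ and $H$ introduced here for the clock frequency, image width and image height):

$$
\text{frame rate} \approx \frac{f_{clk}}{W \times H} = \frac{35 \times 10^{6}}{1920 \times 1080} \approx 16.88\ \text{fps}
$$

This agrees with the 16.8fps figure when rounded down. The per-image overhead of at least 368 start cycles and 16 idle cycles adds only 384 cycles to the roughly 2.07 million pixel cycles, so it changes the result by well under 0.1%.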
 
# Reference
- ITU-T T.87 : Information technology - Lossless and near-lossless compression of continuous-tone still images - Baseline : https://www.itu.int/rec/T-REC-T.87/en
- UBC's JPEG-LS baseline Public Domain Code : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
- Simple JPEG-LS baseline encoder in C language : https://github.com/WangXuan95/JPEG-LS
 
 
 
<span id="cn">FPGA JPEG-LS image compressor</span>
===========================
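A streaming **JPEG-LS** image compressor based on **FPGA**. Features:
* Pure Verilog design, deployable on various FPGA models.
* Compresses **8bit** grayscale images.
* Optional **lossless mode**, i.e. NEAR=0 .
* Optional **lossy mode**, adjustable with NEAR=1~7 .
* The image width ranges over [5,16384] and the image height over [1,16384].
* Minimalist streaming input and output.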
 
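# Background
**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm. Its lossless compression ratio is excellent, better than PNG, Lossless-JPEG2000, Lossless-WEBP, Lossless-HEIF, etc. **JPEG-LS** uses the maximum difference between a pixel before and after compression (the **NEAR** value) to control distortion: **NEAR=0** in lossless mode and **NEAR>0** in lossy mode. The larger **NEAR** is, the greater the distortion and the higher the compression ratio. The file suffix of a **JPEG-LS** compressed image is .**jls** .
JPEG-LS has two generations:
- JPEG-LS baseline (ITU-T T.87) : JPEG-LS refers to the JPEG-LS baseline by default. **This repo implements the encoder of JPEG-LS baseline**. If you are interested in the software version of the JPEG-LS baseline encoder, see https://github.com/WangXuan95/JPEG-LS (C language)
- JPEG-LS extension (ITU-T T.870) : Its compression ratio is higher than JPEG-LS baseline, but it is very rarely used (no code for it can be found online). **This repo is not about JPEG-LS extension!** However, I have implemented JPEG-LS extension in C following ITU-T T.870, see https://github.com/WangXuan95/JPEG-LS_extension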
 
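# Module Usage
[**jls_encoder.sv**](./RTL/jls_encoder.sv) in the RTL directory is the JPEG-LS compression module that users can instantiate. It takes raw image pixels as input and outputs a JPEG-LS compressed stream.
## Module parameters
**jls_encoder** has a single parameter: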
```verilog
parameter logic [2:0] NEAR
```
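It sets the **NEAR** value: 3'd0 selects lossless mode, and 3'd1~3'd7 select lossy mode.
## Module signals
The input and output signals of **jls_encoder** are described in the following table.
| Signal | Full name | Direction | Width | Description |
| :---: | :---: | :---: | :---: | :--- |
| rstn | synchronous reset | input | 1bit | The module is reset if rstn=0 at a rising clock edge. Keep rstn=1 in normal operation. |
| clk | clock | input | 1bit | Clock. All signals should be aligned to the rising edge of clk. |
| i_sof | start of image | input | 1bit | To input a new image, hold i_sof=1 for at least 368 clock cycles. |
| i_w | image width - 1 | input | 14bit | For example, if the image width is 1920, set i_w to 14'd1919. Must remain valid while i_sof=1. |
| i_h | image height - 1 | input | 14bit | For example, if the image height is 1080, set i_h to 14'd1079. Must remain valid while i_sof=1. |
| i_e | input pixel valid | input | 1bit | When i_e=1, a pixel must be presented on i_x. |
| i_x | input pixel | input | 8bit | Pixel values range from 8'd0 to 8'd255 . |
| o_e | output valid | output | 1bit | When o_e=1, output stream data is present on o_data. |
| o_data | output stream data | output | 16bit | Big-endian: o_data[15:8] comes first, o_data[7:0] second. |
| o_last | end of output stream | output | 1bit | If o_last=1 while o_e=1, this is the last data of an image's output stream. |
> Note: i_w must not be less than 14'd4 .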
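
As an illustration of the parameter and signal table above, a minimal instantiation sketch in Verilog-2001 could look like this. The wrapper module `jls_encoder_wrapper` and the instance name `u_jls_encoder` are hypothetical; only the parameter name, port names and widths are taken from the table.

```verilog
// Hypothetical wrapper for illustration: only the jls_encoder parameter and
// port names/widths come from the README table.
module jls_encoder_wrapper (
    input  wire        rstn,
    input  wire        clk,
    input  wire        i_sof,
    input  wire [13:0] i_w,       // image width  - 1
    input  wire [13:0] i_h,       // image height - 1
    input  wire        i_e,
    input  wire [ 7:0] i_x,
    output wire        o_e,
    output wire [15:0] o_data,
    output wire        o_last
);

jls_encoder #(
    .NEAR    ( 3'd0   )           // 3'd0: lossless, 3'd1~3'd7: lossy
) u_jls_encoder (
    .rstn    ( rstn   ),
    .clk     ( clk    ),
    .i_sof   ( i_sof  ),
    .i_w     ( i_w    ),
    .i_h     ( i_h    ),
    .i_e     ( i_e    ),
    .i_x     ( i_x    ),
    .o_e     ( o_e    ),
    .o_data  ( o_data ),
    .o_last  ( o_last )
);

endmodule
```

Because NEAR is a module parameter, a design that needs both lossless and lossy compression at run time would need one instance per NEAR value.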
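## Input image
The operating procedure of the **jls_encoder module** is:
1. **Reset** (optional): hold rstn=0 for at least **1 cycle** to reset the module, then keep rstn=1 during normal operation. Resetting is in fact optional, i.e. rstn may simply be tied to 1.
2. **Start**: hold i_sof=1 for **at least 368 cycles** while driving the image width and height (each minus 1, see the table above) on i_w and i_h. i_w and i_h must remain valid for as long as i_sof=1.
3. **Input**: drive i_e and i_x to input all pixels of the image, from left to right and from top to bottom. Whenever i_e=1, i_x is taken as one pixel.
4. **Inter-image idle**: after all pixels have been input, stay idle for **at least 16 cycles** (i.e. i_sof=0, i_e=0) before returning to step 2 to start the next image.
Between i_sof=1 and i_e=1, and between successive i_e=1 cycles, any number of idle bubbles (i.e. i_sof=0, i_e=0) may be inserted. This means pixels can be fed intermittently (although the highest performance is reached only when no bubbles are inserted).
The figure below shows the input timing for compressing 2 images (// denotes omitted cycles, X denotes don't care). Image 1 has 1 bubble inserted after its first pixel, while image 2 has 1 bubble inserted after i_sof=1. Note that the **inter-image idle** period must last at least **16 cycles**.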
__ __// __ __ __ __ //_ __ // __ __// __ __ __ // __
clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_
_______//________ // // _______//________ //
i_sof ____/ // \________________//___________//____/ // \___________//________
_______//________ // // _______//________ //
i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
_______//________ // // _______//________ //
i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX
// _____ ____//_____ // // _____//____
i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___
// _____ ____//_____ // // _____//____
i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX
Stage: | start image 1 | input image 1 | inter-image idle | start image 2 | input image 2
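
The start, input and inter-image idle steps shown in the timing diagram could be driven from a testbench roughly as follows. This is a simplified sketch, not the repository's tb_jls_encoder.sv: the module name `tb_feed_sketch`, the task `feed_image`, the synthetic ramp pixels and the final 1000-cycle wait are all assumptions made for illustration, and the jls_encoder source must be compiled alongside it.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench sketch: feeds two small synthetic images, no bubbles.
module tb_feed_sketch;

reg         rstn  = 1'b0;
reg         clk   = 1'b0;
reg         i_sof = 1'b0;
reg  [13:0] i_w   = 14'd0;
reg  [13:0] i_h   = 14'd0;
reg         i_e   = 1'b0;
reg  [ 7:0] i_x   = 8'd0;
wire        o_e;
wire [15:0] o_data;
wire        o_last;

always #5 clk = ~clk;

jls_encoder #(.NEAR(3'd0)) u_enc (
    .rstn(rstn), .clk(clk), .i_sof(i_sof), .i_w(i_w), .i_h(i_h),
    .i_e(i_e), .i_x(i_x), .o_e(o_e), .o_data(o_data), .o_last(o_last)
);

// Feed one image of (w+1) x (h+1) pixels.
task feed_image;
    input [13:0] w;                        // image width  - 1
    input [13:0] h;                        // image height - 1
    integer row, col;
begin
    i_sof <= 1'b1;  i_w <= w;  i_h <= h;   // start: i_sof=1 for 368 cycles
    repeat (368) @(posedge clk);
    i_sof <= 1'b0;
    for (row = 0; row <= h; row = row + 1)         // top to bottom
        for (col = 0; col <= w; col = col + 1) begin   // left to right
            i_e <= 1'b1;
            i_x <= row + col;              // synthetic ramp, truncated to 8 bits
            @(posedge clk);
        end
    i_e <= 1'b0;
    repeat (16) @(posedge clk);            // inter-image idle: 16 cycles
end
endtask

initial begin
    repeat (4) @(posedge clk);
    rstn <= 1'b1;                          // release the (optional) reset
    @(posedge clk);
    feed_image(14'd7, 14'd3);              // image 1: 8 x 4 pixels
    feed_image(14'd7, 14'd3);              // image 2: 8 x 4 pixels
    repeat (1000) @(posedge clk);          // assumed to be enough to drain the pipeline
    $finish;
end

endmodule
```

To insert bubbles, deassert i_e for extra cycles inside the inner loop; the encoder only consumes i_x on cycles where i_e=1.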
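## Output compressed stream
During the input, **jls_encoder** simultaneously outputs the compressed **JPEG-LS stream**, which constitutes the complete content of a .jls file (including the file header and trailer). When o_e=1, o_data carries one valid output word. o_data is big-endian: o_data[15:8] comes earlier in the stream and o_data[7:0] later. On the last word of each image's output stream, o_last=1 indicates the end of that image's compressed stream.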
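
The big-endian byte order described above can be made concrete with a small simulation-only collector that unpacks each 16-bit word into two bytes. This is a sketch under assumptions: the module name `jls_stream_collector`, the `MAX_BYTES` parameter and the byte buffer are hypothetical and are not part of the repository.

```verilog
// Simulation-only sketch (not synthesizable as written). Collects the
// compressed stream of one image as bytes, in .jls file order.
module jls_stream_collector #(
    parameter MAX_BYTES = 65536            // hypothetical buffer size
) (
    input wire        clk,
    input wire        o_e,
    input wire [15:0] o_data,
    input wire        o_last
);

reg [7:0] bytes [0:MAX_BYTES-1];
integer   n = 0;                           // bytes collected for the current image

always @ (posedge clk)
    if (o_e) begin
        if (n + 2 <= MAX_BYTES) begin
            bytes[n]   = o_data[15:8];     // high byte comes first in the stream
            bytes[n+1] = o_data[ 7:0];     // low byte follows
            n = n + 2;
        end
        if (o_last) begin
            $display("image finished, %0d bytes collected", n);
            n = 0;                         // start over for the next image
        end
    end

endmodule
```

Writing `bytes[0]` to `bytes[n-1]` to a file in index order would reproduce the .jls byte sequence, provided the stream did not exceed the assumed buffer size.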
 
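# RTL Simulation
All simulation-related files are in the SIM directory, including:
* tb_jls_encoder.sv is the testbench for jls_encoder. It feeds the uncompressed .pgm images in a specified folder into jls_encoder one after another, then saves the output of jls_encoder to .jls files.
* tb_jls_encoder_run_iverilog.bat contains the commands for running the iverilog simulation.
* The images folder contains several image files in .pgm format. The .pgm format stores uncompressed (i.e. raw pixel) 8bit grayscale images. It can be opened with Photoshop or a Linux image viewer (the Windows image viewer cannot open it).
> The .pgm file format is very simple: a file header gives the width and height of the image, followed directly by all of its raw pixels. I therefore chose .pgm files as the simulation input, since only a little testbench code is needed to parse a .pgm file and send its pixels to jls_encoder. You do not need to care about the .pgm format, though: jls_encoder has nothing to do with it and only needs the raw pixels of an image as input. Just look at the simulation waveforms to see how the pixels are fed into jls_encoder.
Before simulating with iverilog, you need to install iverilog, see: [iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md)
Then double-click tb_jls_encoder_run_iverilog.bat to run the simulation, which takes somewhat more than ten minutes.
After the simulation finishes, you will find several .jls files in the folder; these are the compressed images. The simulation also produces a waveform file, dump.vcd, which you can open with gtkwave.
In addition, you can also modify some simulation parameters:
- Modify the macro **NEAR** in tb_jls_encoder.sv to change the compression ratio.
- Modify the macro **BUBBLE_CONTROL** in tb_jls_encoder.sv to set how many bubbles are inserted between adjacent input pixels:
- When **BUBBLE_CONTROL=0**, no bubbles are inserted.
- When **BUBBLE_CONTROL>0**, **BUBBLE_CONTROL** bubbles are inserted.
- When **BUBBLE_CONTROL<0**, a random number of bubbles in **0~(-BUBBLE_CONTROL)** is inserted each time.
> Under different NEAR and BUBBLE_CONTROL values, this repository has been verified by comparing its results on several hundred photos, which gives high confidence that it is free of bugs. (The automated verification code is not included here.)
## View compressed JLS file
Because **JPEG-LS** is rather niche and specialized, most image viewers cannot open .jls files.
You can try [this website](https://filext.com/file-extension/JLS) to view .jls files (although the site is often unavailable).
If the site is unavailable, you can use the provided decompressor decoder.exe to convert a .jls file back to a .pgm file and view that instead. Run the following command with CMD in the SIM directory: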
```powershell
.\decoder.exe <JLS_FILE_NAME> <PGM_FILE_NAME>
```
For example:
```powershell
.\decoder.exe test000.jls tmp.pgm
```
> Note: decoder.exe is compiled from the C language source code provided by UBC : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
 
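# FPGA Deployment
On Xilinx Artix-7 xc7a35tcsg324-2, the synthesis and implementation results are as follows.
| LUT | FF | BRAM | Max. clock frequency |
| :--------: | :------: | :----------------------------: | :----------: |
| 2347 (11%) | 932 (2%) | 9 x RAMB18 (9%), equivalent to 144Kbit | 35 MHz |
At 35MHz, the image compression performance is 35 Mpixel/s, which means the compression frame rate for 1920x1080 images is 16.8fps.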
 
# Related links
- ITU-T T.87 : Information technology - Lossless and near-lossless compression of continuous-tone still images - Baseline : https://www.itu.int/rec/T-REC-T.87/en
- UBC's JPEG-LS baseline Public Domain Code : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm
- Simple JPEG-LS baseline encoder (C language) : https://github.com/WangXuan95/JPEG-LS


@ -2,12 +2,12 @@
//--------------------------------------------------------------------------------------------------------
// Module : jls_encoder
// Type : synthesizable, IP's top
// Standard: SystemVerilog 2005 (IEEE1800-2005)
// Standard: Verilog 2001 (IEEE1364-2001)
// Function: JPEG-LS image compressor
//--------------------------------------------------------------------------------------------------------
module jls_encoder #(
parameter [2:0] NEAR = 3'd1
parameter [ 2:0] NEAR = 3'd0
) (
input wire rstn,
input wire clk,
@ -21,10 +21,12 @@ module jls_encoder #(
output wire o_last // indicate the last output data of an image
);
//---------------------------------------------------------------------------------------------------------------------------
// local parameters
//---------------------------------------------------------------------------------------------------------------------------
wire [3:0] P_QBPPS [8];
wire [3:0] P_QBPPS [0:7];
assign P_QBPPS[0] = 4'd8;
assign P_QBPPS[1] = 4'd7;
assign P_QBPPS[2] = 4'd6;
@ -34,19 +36,19 @@ assign P_QBPPS[5] = 4'd5;
assign P_QBPPS[6] = 4'd5;
assign P_QBPPS[7] = 4'd5;
localparam logic P_LOSSY = NEAR != '0;
localparam logic signed [8:0] P_NEAR = $signed({6'd0, NEAR});
localparam logic signed [8:0] P_T1 = $signed(9'd3) + $signed(9'd3) * P_NEAR;
localparam logic signed [8:0] P_T2 = $signed(9'd7) + $signed(9'd5) * P_NEAR;
localparam logic signed [8:0] P_T3 = $signed(9'd21)+ $signed(9'd7) * P_NEAR;
localparam logic signed [9:0] P_QUANT = {P_NEAR, 1'b1};
localparam logic signed [9:0] P_QBETA = $signed(10'd256 + {5'd0,NEAR,2'd0}) / P_QUANT;
localparam logic signed [9:0] P_QBETAHALF = (P_QBETA+$signed(10'd1)) / $signed(10'd2);
wire [3:0] P_QBPP = P_QBPPS[NEAR];
wire [4:0] P_LIMIT = 5'd31 - {1'b0, P_QBPP};
localparam logic [12:0] P_AINIT = (NEAR=='0) ? 13'd4 : 13'd2;
localparam P_LOSSY = (NEAR != 3'd0);
localparam signed [8:0] P_NEAR = $signed({6'd0, NEAR});
localparam signed [8:0] P_T1 = $signed(9'd3) + $signed(9'd3) * P_NEAR;
localparam signed [8:0] P_T2 = $signed(9'd7) + $signed(9'd5) * P_NEAR;
localparam signed [8:0] P_T3 = $signed(9'd21)+ $signed(9'd7) * P_NEAR;
localparam signed [9:0] P_QUANT = {P_NEAR, 1'b1};
localparam signed [9:0] P_QBETA = $signed(10'd256 + {5'd0,NEAR,2'd0}) / P_QUANT;
localparam signed [9:0] P_QBETAHALF = (P_QBETA+$signed(10'd1)) / $signed(10'd2);
wire [3:0] P_QBPP = P_QBPPS[NEAR];
wire [4:0] P_LIMIT = 5'd31 - {1'b0, P_QBPP};
localparam [12:0] P_AINIT = (NEAR == 3'd0) ? 13'd4 : 13'd2;
wire [3:0] J [32];
wire [3:0] J [0:31];
assign J[ 0] = 4'd0;
assign J[ 1] = 4'd0;
assign J[ 2] = 4'd0;
@ -85,188 +87,232 @@ assign J[31] = 4'd15;
//---------------------------------------------------------------------------------------------------------------------------
// function: is_near
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic func_is_near(input [7:0] x1, input [7:0] x2);
logic signed [8:0] ex1, ex2;
function [0:0] func_is_near;
input [7:0] x1, x2;
reg signed [8:0] ex1, ex2;
begin
ex1 = $signed({1'b0,x1});
ex2 = $signed({1'b0,x2});
return ex1 - ex2 <= P_NEAR && ex2 - ex1 <= P_NEAR;
func_is_near = ((ex1 - ex2 <= P_NEAR) && (ex2 - ex1 <= P_NEAR));
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: predictor (get_px)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [7:0] func_predictor(input [7:0] a, input [7:0] b, input [7:0] c);
function [7:0] func_predictor;
input [7:0] a, b, c;
begin
if( c>=a && c>=b )
return a>b ? b : a;
func_predictor = (a>b) ? b : a;
else if( c<=a && c<=b )
return a>b ? a : b;
func_predictor = (a>b) ? a : b;
else
return a - c + b;
func_predictor = a - c + b;
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: q_quantize
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [3:0] func_q_quantize(input [7:0] x1, input [7:0] x2);
logic signed [8:0] delta;
function signed [3:0] func_q_quantize;
input [7:0] x1, x2;
reg signed [8:0] delta;
begin
delta = $signed({1'b0,x1}) - $signed({1'b0,x2});
if (delta <= -P_T3 )
return -$signed(4'd4);
func_q_quantize = -$signed(4'd4);
else if(delta <= -P_T2 )
return -$signed(4'd3);
func_q_quantize = -$signed(4'd3);
else if(delta <= -P_T1 )
return -$signed(4'd2);
func_q_quantize = -$signed(4'd2);
else if(delta < -P_NEAR )
return -$signed(4'd1);
func_q_quantize = -$signed(4'd1);
else if(delta <= P_NEAR )
return $signed(4'd0);
func_q_quantize = $signed(4'd0);
else if(delta < P_T1 )
return $signed(4'd1);
func_q_quantize = $signed(4'd1);
else if(delta < P_T2 )
return $signed(4'd2);
func_q_quantize = $signed(4'd2);
else if(delta < P_T3 )
return $signed(4'd3);
func_q_quantize = $signed(4'd3);
else
return $signed(4'd4);
func_q_quantize = $signed(4'd4);
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get_q (part 1), qp1 = 81*Q(d-b) + 9*Q(b-c)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_get_qp1(input [7:0] c, input [7:0] b, input [7:0] d);
return $signed(10'd81) * func_q_quantize(d,b) + $signed(10'd9) * func_q_quantize(b,c);
function signed [9:0] func_get_qp1;
input [7:0] c, b, d;
begin
func_get_qp1 = $signed(10'd81) * func_q_quantize(d,b) + $signed(10'd9) * func_q_quantize(b,c);
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get_q (part 2), get sign(qs) and abs(qs), where qs = qp1 + Q(c-a)
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [9:0] func_get_q(input signed [9:0] qp1, input [7:0] c, input [7:0] a);
logic signed [9:0] qs;
logic s;
logic [8:0] q;
function [9:0] func_get_q;
input signed [9:0] qp1;
input [7:0] c, a;
reg signed [9:0] qs;
reg s;
reg [8:0] q;
begin
qs = qp1 + func_q_quantize(c,a);
s = qs[9];
q = s ? (~qs[8:0]+9'd1) : qs[8:0];
return {s, q};
func_get_q = {s, q};
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: clip
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [7:0] func_clip(input signed [9:0] val);
function [7:0] func_clip;
input signed [9:0] val;
begin
if( val > $signed(10'd255) )
return 8'd255;
func_clip = 8'd255;
else if( val < $signed(10'd0) )
return 8'd0;
func_clip = 8'd0;
else
return val[7:0];
func_clip = val[7:0];
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: errval_quantize
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_errval_quantize(input signed [9:0] err);
function signed [9:0] func_errval_quantize;
input signed [9:0] err;
begin
if(err[9])
return -( (P_NEAR - err) / P_QUANT );
func_errval_quantize = -( (P_NEAR - err) / P_QUANT );
else
return (P_NEAR + err) / P_QUANT;
func_errval_quantize = (P_NEAR + err) / P_QUANT;
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: modrange
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic signed [9:0] func_modrange(input signed [9:0] val);
logic signed [9:0] new_val;
new_val = val;
if( new_val[9] )
new_val += P_QBETA;
if( new_val >= P_QBETAHALF )
new_val -= P_QBETA;
return new_val;
function signed [9:0] func_modrange;
input signed [9:0] val;
begin
func_modrange = val;
if( func_modrange[9] )
func_modrange = func_modrange + P_QBETA;
if( func_modrange >= P_QBETAHALF )
func_modrange = func_modrange - P_QBETA;
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: get k
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [3:0] func_get_k(input [12:0] A, input [6:0] N, input rt);
logic [18:0] Nt, At;
logic [ 3:0] k;
function [ 3:0] func_get_k;
input [12:0] A;
input [ 6:0] N;
input rt;
reg [18:0] Nt, At;
reg [ 3:0] ii;
begin
Nt = {12'h0, N};
At = { 6'h0, A};
k = 4'd0;
if(rt)
At += {13'd0, N[6:1]};
for(int ii=0; ii<13; ii++)
func_get_k = 4'd0;
if (rt)
At = At + {13'd0, N[6:1]};
for (ii=4'd0; ii<4'd13; ii=ii+4'd1)
if((Nt<<ii) < At)
k++;
return k;
func_get_k = func_get_k + 4'd1;
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: B update for run mode
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [6:0] B_update(input reset, input [6:0] B, input errm0);
function [6:0] B_update;
input reset;
input [6:0] B;
input errm0;
begin
B_update = B;
if(errm0)
B_update ++;
if(reset)
B_update >>>= 1;
if (errm0)
B_update = B_update + 7'd1;
if (reset)
B_update = B_update >>> 1;
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: C, B update for regular mode
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [14:0] C_B_update(input reset, input [6:0] N, input signed [7:0] C, input signed [6:0] B, input signed [9:0] err);
logic signed [9:0] Bt;
logic signed [7:0] Ct;
function [14:0] C_B_update;
input reset;
input [6:0] N;
input signed [7:0] C;
input signed [6:0] B;
input signed [9:0] err;
reg signed [9:0] Bt;
reg signed [7:0] Ct;
begin
Bt = B;
Ct = C;
Bt += err * P_QUANT;
Bt = Bt + (err * P_QUANT);
if(reset)
Bt >>>= 1;
Bt = Bt >>> 1;
if( Bt <= -$signed({3'd0,N}) ) begin
Bt += $signed({3'd0,N});
Bt = Bt + $signed({3'd0,N});
if( Bt <= -$signed({3'd0,N}) )
Bt = -$signed({3'd0,N}-10'd1);
if( Ct != $signed(8'd128) )
Ct--;
Ct = Ct - 8'd1;
end else if( Bt > $signed(10'd0) ) begin
Bt -= $signed({3'd0,N});
Bt = Bt - $signed({3'd0,N});
if( Bt > $signed(10'd0) )
Bt = $signed(10'd0);
if( Ct != $signed(8'd127) )
Ct++;
Ct = Ct + 8'd1;
end
return {Ct, Bt[6:0]};
C_B_update = {Ct, Bt[6:0]};
end
endfunction
//---------------------------------------------------------------------------------------------------------------------------
// function: A update
//---------------------------------------------------------------------------------------------------------------------------
function automatic logic [12:0] A_update(input reset, input [12:0] A, input [9:0] inc);
function [12:0] A_update;
input reset;
input [12:0] A;
input [ 9:0] inc;
begin
A_update = A + {3'd0, inc};
if(reset)
A_update >>>= 1;
A_update = A_update >>> 1;
end
endfunction
//-------------------------------------------------------------------------------------------------------------------
// context memories
//-------------------------------------------------------------------------------------------------------------------
reg [ 5:0] Nram [366];
reg [12:0] Aram [366];
reg signed [ 6:0] Bram [366];
reg [ 5:0] Nram [0:365];
reg [12:0] Aram [0:365];
reg signed [ 6:0] Bram [0:365];
reg signed [ 7:0] Cram [1:364];
@ -285,7 +331,7 @@ reg [14:0] a_jj;
always @ (posedge clk)
if(~rstn) begin
{a_sof, a_e, a_x, a_w, a_h, a_wl, a_hl, a_ii, a_jj} <= '0;
{a_sof, a_e, a_x, a_w, a_h, a_wl, a_hl, a_ii, a_jj} <= 0;
end else begin
a_sof <= i_sof;
a_e <= i_e;
@ -295,13 +341,13 @@ always @ (posedge clk)
if(a_sof) begin
a_wl <= (a_w<14'd4 ? 14'd4 : a_w);
a_hl <= a_h;
a_ii <= '0;
a_jj <= '0;
a_ii <= 14'd0;
a_jj <= 15'd0;
end else if(a_e) begin
if(a_ii < a_wl)
a_ii <= a_ii + 14'd1;
else begin
a_ii <= '0;
a_ii <= 14'd0;
if(a_jj <= {1'b0,a_hl})
a_jj <= a_jj + 15'd1;
end
@ -324,12 +370,12 @@ reg [ 7:0] b_x;
always @ (posedge clk) begin
b_sof <= a_sof & rstn;
if(~rstn | a_sof) begin
{b_e, b_fc, b_lc, b_fr, b_eof, b_ii, b_x} <= '0;
{b_e, b_fc, b_lc, b_fr, b_eof, b_ii, b_x} <= 0;
end else begin
b_e <= a_e & (a_jj <= {1'b0,a_hl});
b_fc <= a_e & (a_ii == '0);
b_fc <= a_e & (a_ii == 14'd0);
b_lc <= a_e & (a_ii == a_wl);
b_fr <= a_e & (a_jj == '0);
b_fr <= a_e & (a_jj == 15'd0);
b_eof <= a_jj > {1'b0,a_hl};
b_ii <= a_ii;
b_x <= a_x;
@ -356,7 +402,7 @@ reg [ 7:0] c_d;
always @ (posedge clk) begin
c_sof <= b_sof & rstn;
if(~rstn | b_sof) begin
{c_e,c_fc,c_lc,c_fr,c_eof,c_ii,c_x,c_b,c_bt,c_c} <= '0;
{c_e,c_fc,c_lc,c_fr,c_eof,c_ii,c_x,c_b,c_bt,c_c} <= 0;
end else begin
c_e <= b_e;
c_fc <= b_fc;
@ -366,10 +412,10 @@ always @ (posedge clk) begin
c_ii <= b_ii;
if(b_e) begin
c_x <= b_x;
c_b <= b_fr ? '0 : c_d;
c_b <= b_fr ? 8'd0 : c_d;
if(b_fr) begin
c_bt <= '0;
c_c <= '0;
c_bt <= 8'd0;
c_c <= 8'd0;
end else if(b_fc) begin
c_bt <= c_d;
c_c <= c_bt;
@ -394,12 +440,13 @@ reg [ 7:0] d_b;
reg [ 7:0] d_c;
reg signed [9:0] d_qp1;
wire [7:0] c_w_d = c_fr ? 8'd0 : (c_lc ? c_b : c_d);
always @ (posedge clk) begin
d_sof <= c_sof & rstn;
if(~rstn | c_sof) begin
{d_e, d_fc, d_lc, d_eof, d_ii, d_x, d_b, d_c, d_qp1} <= '0;
{d_e, d_fc, d_lc, d_eof, d_ii, d_x, d_b, d_c, d_qp1} <= 0;
end else begin
logic [7:0] d;
d_e <= c_e;
d_fc <= c_fc;
d_lc <= c_lc;
@ -408,8 +455,7 @@ always @ (posedge clk) begin
d_x <= c_x;
d_b <= c_b;
d_c <= c_c;
d = c_fr ? '0 : (c_lc ? c_b : c_d);
d_qp1 <= func_get_qp1(c_c, c_b, d);
d_qp1 <= func_get_qp1(c_c, c_b, c_w_d);
end
end
@ -436,48 +482,49 @@ reg [5:0] e_Nn;
reg signed [7:0] e_Cn;
reg signed [6:0] e_Bn;
wire [7:0] d_w_a = d_fc ? d_b : e_x;
reg s; // not real register
reg [8:0] q; // not real register
reg rt; // not real register
reg runi; // not real register
reg rune; // not real register
reg signed [7:0] Co; // not real register
reg [6:0] No, Nn; // not real register
reg signed [6:0] Bo; // not real register
reg signed [9:0] px; // not real register
reg signed [9:0] err; // not real register
always @ (posedge clk) begin
e_sof <= d_sof & rstn;
e_2BleN <= 1'b0;
{e_write_C, e_Cn, e_write_en, e_Bn, e_Nn} <= '0;
{e_write_C, e_Cn, e_write_en, e_Bn, e_Nn} <= 0;
if(~rstn | d_sof) begin
{e_e, e_fc, e_lc, e_eof, e_ii, e_runi, e_rune, e_x, e_q, e_rt, e_err, e_No} <= '0;
{e_e, e_fc, e_lc, e_eof, e_ii, e_runi, e_rune, e_x, e_q, e_rt, e_err, e_No} <= 0;
end else begin
logic [7:0] a;
logic s;
logic [8:0] q;
logic rt;
logic runi;
logic rune;
logic signed [7:0] Co;
logic [6:0] No, Nn;
logic signed [6:0] Bo;
logic signed [9:0] px;
logic signed [9:0] err;
a = d_fc ? d_b : e_x;
rt = 1'b0;
rune = 1'b0;
No = '0;
err = '0;
{s, q} = func_get_q(d_qp1, d_c, a);
No = 0;
err = 0;
{s, q} = func_get_q(d_qp1, d_c, d_w_a);
Co = (e_write_C & e_q==q) ? e_Cn : Cram[q];
runi = ~d_fc & e_runi | (q == 9'd0);
if(runi) begin
runi = func_is_near(d_x, a);
runi = func_is_near(d_x, d_w_a);
rune = ~runi;
end
if(d_e) begin
if(runi) begin
e_x <= P_LOSSY ? a : d_x;
e_x <= P_LOSSY ? d_w_a : d_x;
end else begin
if(rune) begin
rt = func_is_near(d_b, a);
s = {1'b0,a} > ({1'b0,d_b} + {6'd0,NEAR}) ? 1'b1 : 1'b0;
rt = func_is_near(d_b, d_w_a);
s = {1'b0,d_w_a} > ({1'b0,d_b} + {6'd0,NEAR}) ? 1'b1 : 1'b0;
q = rt ? 9'd365 : 9'd0;
px = rt ? a : d_b;
px = rt ? d_w_a : d_b;
end else begin
px[9:8] = 2'b00;
px[7:0] = func_clip( $signed({2'h0, func_predictor(a,d_b,d_c)}) + ( s ? -$signed({Co[7],Co[7],Co}) : $signed({Co[7],Co[7],Co}) ) );
px[7:0] = func_clip( $signed({2'h0, func_predictor(d_w_a, d_b, d_c)}) + ( s ? -$signed({Co[7],Co[7],Co}) : $signed({Co[7],Co[7],Co}) ) );
end
err = s ? px - $signed({2'd0, d_x}) : $signed({2'd0, d_x}) - px;
err = func_errval_quantize(err);
@ -485,7 +532,7 @@ always @ (posedge clk) begin
err = func_modrange(err);
No = ((e_write_en & e_q==q) ? e_Nn : Nram[q]) + 7'd1;
Nn = No;
if(No[6]) Nn >>>= 1;
if(No[6]) Nn = Nn >>> 1;
e_Nn <= Nn[5:0];
Bo = (e_write_en & e_q==q) ? e_Bn : Bram[q];
e_write_en <= 1'b1;
@ -518,6 +565,7 @@ end
// pipeline stage f: write Cram, Bram, Nram
//-------------------------------------------------------------------------------------------------------------------
reg [8:0] NBC_init_addr;
always @ (posedge clk)
NBC_init_addr <= e_sof ? NBC_init_addr + (NBC_init_addr < 9'd366 ? 9'd1 : 9'd0) : 9'd0;
@ -556,7 +604,7 @@ reg ef_write_en;
always @ (posedge clk) begin
ef_sof <= e_sof & rstn;
if(~rstn | e_sof) begin
{ef_e, ef_fc, ef_lc, ef_eof, ef_runi, ef_rune, ef_2BleN, ef_q, ef_rt, ef_err, ef_No, ef_write_en} <= '0;
{ef_e, ef_fc, ef_lc, ef_eof, ef_runi, ef_rune, ef_2BleN, ef_q, ef_rt, ef_err, ef_No, ef_write_en} <= 0;
end else begin
ef_e <= e_e;
ef_fc <= e_fc;
@ -603,50 +651,50 @@ reg g_write_en;
reg [12:0] g_An;
always @ (posedge clk)
if(~rstn | f_sof)
{g_q, g_write_en, g_An} <= '0;
{g_q, g_write_en, g_An} <= 0;
else
{g_q, g_write_en, g_An} <= {f_q, f_write_en, f_An};
reg [ 1:0] on; // not real register
reg [15:0] rc; // not real register
reg [ 4:0] ri; // not real register
reg [12:0] Ao; // not real register
reg [ 3:0] k; // not real register
reg [ 9:0] abserr; // not real register
reg [ 9:0] merr, Ainc; // not real register
reg map; // not real register
always @ (posedge clk) begin
f_sof <= ef_sof & rstn;
f_limit <= P_LIMIT;
f_An <= P_AINIT;
if(~rstn | ef_sof) begin
{f_e, f_eof, f_runi, f_rune, f_merr, f_k, f_rc, f_ri, f_on, f_cb, f_cn, f_q, f_write_en} <= '0;
{f_e, f_eof, f_runi, f_rune, f_merr, f_k, f_rc, f_ri, f_on, f_cb, f_cn, f_q, f_write_en} <= 0;
end else begin
logic [ 1:0] on;
logic [15:0] rc;
logic [ 4:0] ri;
logic [12:0] Ao;
logic [ 3:0] k;
logic [ 9:0] abserr;
logic [ 9:0] merr, Ainc;
logic map;
on = '0;
rc = (ef_fc|~ef_runi) ? '0 : f_rc;
on = 2'd0;
rc = (ef_fc|~ef_runi) ? 16'd0 : f_rc;
ri = f_ri;
Ao = (f_write_en & f_q==ef_q) ? f_An : (g_write_en & g_q==ef_q) ? g_An : ef_Ao;
abserr = ef_err<$signed(10'd0) ? $unsigned(-ef_err) : $unsigned(ef_err);
merr='0;
Ainc='0;
merr = 10'd0;
Ainc = 10'd0;
f_write_en <= ef_write_en;
k = func_get_k(Ao, ef_No, ef_rt);
f_cb <= ef_fc ? '0 : f_rc;
f_cb <= ef_fc ? 16'd0 : f_rc;
f_cn <= {1'b0,J[ri]} + 5'd1;
if(ef_runi) begin
rc ++;
rc = rc + 16'd1;
if(rc >= (16'd1<<J[ri])) begin
on++;
rc -= (16'd1<<J[ri]);
if(ri < 5'd31) ri ++;
on = on + 2'd1;
rc = rc - (16'd1<<J[ri]);
if(ri < 5'd31) ri = ri + 5'd1;
end
if(ef_lc & (rc > 16'd0))
on++;
on = on + 2'd1;
end else if(ef_rune) begin
f_limit <= P_LIMIT - 5'd1 - {1'b0,J[ri]};
if(ri > '0) ri --;
map = ~( (ef_err=='0) | ( (ef_err>$signed(10'd0)) ^ (k==4'd0 & ef_2BleN) ) );
if (ri > 5'd0) ri = ri - 5'd1;
map = ~( (ef_err==10'd0) | ( (ef_err>$signed(10'd0)) ^ (k==4'd0 & ef_2BleN) ) );
merr = (abserr<<1) - {9'd0,ef_rt} - {9'd0,map};
Ainc = ((merr + {9'd0,~ef_rt}) >> 1);
end else begin
@ -700,21 +748,21 @@ reg [ 4:0] g_zn; // in range of 0~27
reg [ 9:0] g_db;
reg [ 3:0] g_dn; // in range of 0~13
wire [ 9:0] f_w_merr_sk = (f_merr >> f_k);
always @ (posedge clk) begin
g_sof <= f_sof & rstn;
if(~rstn | f_sof) begin
{g_e, g_eof, g_runi, g_on, g_cb, g_cn, g_zn, g_db, g_dn} <= '0;
{g_e, g_eof, g_runi, g_on, g_cb, g_cn, g_zn, g_db, g_dn} <= 0;
end else begin
logic [9:0] merr_sk;
merr_sk = f_merr >> f_k;
g_e <= f_e;
g_eof <= f_eof;
g_runi <= f_runi;
g_on <= f_on;
g_cb <= f_rune ? f_cb : '0;
g_cn <= f_rune ? f_cn : '0;
if(merr_sk < f_limit) begin
g_zn <= merr_sk[4:0];
g_cb <= f_rune ? f_cb : 16'd0;
g_cn <= f_rune ? f_cn : 5'd0;
if(f_w_merr_sk < f_limit) begin
g_zn <= f_w_merr_sk[4:0];
g_db <= f_merr & ~(10'h3ff<<f_k);
g_dn <= f_k;
end else begin
@ -736,7 +784,8 @@ reg [ 5:0] h_bn; // in range of 0~57
always @ (posedge clk) begin
h_sof <= g_sof & rstn;
{h_bb, h_bn} <= '0;
h_bb <= 57'd0;
h_bn <= 6'd0;
if(~rstn | g_sof) begin
h_eof <= 1'b0;
end else begin
@ -767,42 +816,46 @@ reg [15:0] j_data;
reg[247:0] j_bbuf;
reg [ 7:0] j_bcnt;
reg [247:0] bbuf; // not real register
reg [ 7:0] bcnt; // not real register
always @ (posedge clk) begin
j_sof <= h_sof & rstn;
{j_e, j_data} <= '0;
j_e <= 1'b0;
j_data <= 16'h0;
if(~rstn | h_sof) begin
{j_eof, j_bbuf, j_bcnt} <= '0;
j_eof <= 1'b0;
j_bbuf <= 248'd0;
j_bcnt <= 8'h0;
end else begin
logic [247:0] bbuf;
logic [ 7:0] bcnt;
bbuf = j_bbuf | ({h_bb,191'h0} >> j_bcnt);
bcnt = j_bcnt + {2'd0,h_bn};
if(bcnt >= 8'd16) begin
j_e <= 1'b1;
j_data[15:8] <= bbuf[247:240];
if(bbuf[247:240] == '1) begin
if(bbuf[247:240] == 8'hFF) begin
bbuf = {1'h0, bbuf[239:0], 7'h0};
bcnt -= 8'd7;
bcnt = bcnt - 8'd7;
end else begin
bbuf = { bbuf[239:0], 8'h0};
bcnt -= 8'd8;
bcnt = bcnt - 8'd8;
end
j_data[ 7:0] <= bbuf[247:240];
if(bbuf[247:240] == '1) begin
if(bbuf[247:240] == 8'hFF) begin
bbuf = {1'h0, bbuf[239:0], 7'h0};
bcnt -= 8'd7;
bcnt = bcnt - 8'd7;
end else begin
bbuf = { bbuf[239:0], 8'h0};
bcnt -= 8'd8;
bcnt = bcnt - 8'd8;
end
end else if(h_eof && bcnt > 8'd0) begin
j_e <= 1'b1;
j_data[15:8] <= bbuf[247:240];
if(bbuf[247:240] == '1)
if(bbuf[247:240] == 8'hFF)
j_data[ 7:0] <= {1'b0,bbuf[239:233]};
else
j_data[ 7:0] <= bbuf[239:232];
bbuf = '0;
bbuf = 248'd0;
bcnt = 8'd0;
end
j_bbuf <= bbuf;
@ -816,7 +869,7 @@ end
// make .jls file header and footer
//-------------------------------------------------------------------------------------------------------------------
reg [15:0] jls_wl, jls_hl;
wire[15:0] jls_header [13];
wire[15:0] jls_header [0:12];
assign jls_header[0] = 16'hFFD8;
assign jls_header[1] = 16'h00FF;
assign jls_header[2] = 16'hF700;
@ -831,10 +884,11 @@ assign jls_header[10]= 16'h0101;
assign jls_header[11]= {13'b0,NEAR};
assign jls_header[12]= 16'h0000;
wire[15:0] jls_footer = 16'hFFD9;
always @ (posedge clk)
if(~rstn) begin
jls_wl <= '0;
jls_hl <= '0;
jls_wl <= 16'd0;
jls_hl <= 16'd0;
end else begin
jls_wl <= {2'd0,a_wl} + 16'd1;
jls_hl <= {2'd0,a_hl} + 16'd1;
@ -853,21 +907,21 @@ reg [15:0] k_data;
always @ (posedge clk) begin
k_last <= 1'b0;
k_e <= 1'b0;
k_data <= '0;
k_data <= 16'd0;
if(j_sof) begin
k_footer_i <= '0;
k_footer_i <= 1'b0;
if(k_header_i < 4'd13) begin
k_e <= 1'b1;
k_data <= jls_header[k_header_i];
k_header_i <= k_header_i + 4'd1;
end
end else if(j_e) begin
k_header_i <= '0;
k_footer_i <= '0;
k_header_i <= 4'd0;
k_footer_i <= 1'b0;
k_e <= 1'b1;
k_data <= j_data;
end else if(j_eof) begin
k_header_i <= '0;
k_header_i <= 4'd0;
k_footer_i <= 1'b1;
if(~k_footer_i) begin
k_last <= 1'b1;
@ -875,8 +929,8 @@ always @ (posedge clk) begin
k_data <= jls_footer;
end
end else begin
k_header_i <= '0;
k_footer_i <= '0;
k_header_i <= 4'd0;
k_footer_i <= 1'b0;
end
end
@ -884,7 +938,7 @@ end
//-------------------------------------------------------------------------------------------------------------------
// linebuffer for context pixels
//-------------------------------------------------------------------------------------------------------------------
reg [7:0] linebuffer [1<<14];
reg [7:0] linebuffer [0:(1<<14)-1];
always @ (posedge clk) // line buffer read
c_d <= linebuffer[a_ii];
always @ (posedge clk) // line buffer write


@ -10,10 +10,10 @@
`timescale 1ps/1ps
`define NEAR 1 // NEAR can be 0~7
`define NEAR 1 // NEAR can be 0~7
`define FILE_NO_FIRST 1 // first input file name is test000.pgm
`define FILE_NO_FINAL 8 // final input file name is test000.pgm
`define FILE_NO_FIRST 1 // first input file name is test000.pgm
`define FILE_NO_FINAL 8 // final input file name is test000.pgm
// number of bubbles to insert between pixels
@ -51,30 +51,33 @@ initial begin repeat(4) @(posedge clk); rstn<=1'b1; end
// -------------------------------------------------------------------------------------------------------------------
// signals for jls_encoder_i module
// -------------------------------------------------------------------------------------------------------------------
reg i_sof = '0;
reg [13:0] i_w = '0;
reg [13:0] i_h = '0;
reg i_e = '0;
reg [ 7:0] i_x = '0;
reg i_sof = 0;
reg [13:0] i_w = 0;
reg [13:0] i_h = 0;
reg i_e = 0;
reg [ 7:0] i_x = 0;
wire o_e;
wire[15:0] o_data;
wire o_last;
logic [7:0] img [4096*4096];
int w = 0, h = 0;
reg [7:0] img [4096*4096-1:0];
integer w = 0, h = 0;
task automatic load_img(input logic [256*8:1] fname);
int linelen, depth=0, scanf_num;
logic [256*8-1:0] line;
int fp = $fopen(fname, "rb");
if(fp==0) begin
task load_img;
input [256*8:1] fname;
reg [256*8-1:0] line;
integer linelen, depth, scanf_num, fp, i;
begin
depth = 0;
fp = $fopen(fname, "rb");
if (fp==0) begin
$display("*** error: could not open file %s", fname);
$finish;
end
linelen = $fgets(line, fp);
if(line[8*(linelen-2)+:16] != 16'h5035) begin
if (line[8*(linelen-2)+:16] != 16'h5035) begin
$display("*** error: the first line must be P5");
$fclose(fp);
$finish;
@ -87,17 +90,19 @@ task automatic load_img(input logic [256*8:1] fname);
end
scanf_num = $fgets(line, fp);
scanf_num = $sscanf(line, "%d", depth);
if(depth!=255) begin
if (depth!=255) begin
$display("*** error: image depth must be 255");
$fclose(fp);
$finish;
end
for(int i=0; i<h*w; i++)
for (i=0; i<h*w; i=i+1)
img[i] = $fgetc(fp);
$fclose(fp);
end
endtask
// -------------------------------------------------------------------------------------------------------------------
// task: feed image pixels to jls_encoder_i module
// arguments:
@ -108,20 +113,21 @@ endtask
// when > 0, insert bubble_control bubbles
// when < 0, insert random 0~bubble_control bubbles
// -------------------------------------------------------------------------------------------------------------------
task automatic feed_img(input int bubble_control);
int num_bubble;
task feed_img;
input integer bubble_control;
integer num_bubble, i;
begin
// start feeding an image by asserting i_sof for 368 cycles
repeat(368) begin
@(posedge clk)
i_sof <= 1'b1;
i_w <= w - 1;
i_h <= h - 1;
{i_e, i_x} <= '0;
{i_e, i_x} <= 0;
end
// for all pixels of the image
for(int i=0; i<h*w; i++) begin
for(i=0; i<h*w; i=i+1) begin
// calculate how many bubbles to insert
if(bubble_control<0) begin
@ -133,17 +139,18 @@ task automatic feed_img(input int bubble_control);
end
// insert bubbles
repeat(num_bubble) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= '0;
repeat(num_bubble) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= 0;
// assert i_e to input a pixel
@(posedge clk)
{i_sof, i_w, i_h} <= '0;
{i_sof, i_w, i_h} <= 0;
i_e <= 1'b1;
i_x <= img[i];
end
// 16 cycles idle between images
repeat(16) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= '0;
repeat(16) @(posedge clk) {i_sof, i_w, i_h, i_e, i_x} <= 0;
end
endtask
@ -170,10 +177,11 @@ jls_encoder #(
// -------------------------------------------------------------------------------------------------------------------
// read images, feed them to jls_encoder_i module
// -------------------------------------------------------------------------------------------------------------------
int file_no; // file number
integer file_no; // file number
reg [256*8:1] input_file_name;
reg [256*8:1] input_file_format;
initial begin
logic [256*8:1] input_file_name;
logic [256*8:1] input_file_format;
$sformat(input_file_format , "%s\\%s.pgm", `INPUT_PGM_DIR, `FILE_NAME_FORMAT);
while(~rstn) @ (posedge clk);
@ -182,7 +190,7 @@ initial begin
$sformat(input_file_name, input_file_format , file_no);
load_img(input_file_name);
$display("%s (%5dx%5d)", input_file_name, w, h);
$display("%100s (%5dx%5d)", input_file_name, w, h);
if( w < 5 || w > 16384 || h < 1 || h > 16383 ) // image size not supported
$display(" *** image size not supported ***");
@ -202,8 +210,8 @@ end
logic [256*8:1] output_file_format;
initial $sformat(output_file_format, "%s\\%s.jls", `OUTPUT_JLS_DIR, `FILE_NAME_FORMAT);
logic [256*8:1] output_file_name;
int opened = 0;
int jls_file = 0;
integer opened = 0;
integer jls_file = 0;
always @ (posedge clk)
if(o_e) begin


@ -1,5 +1,5 @@
del sim.out dump.vcd
iverilog -g2005-sv -o sim.out tb_jls_encoder.sv ../RTL/jls_encoder.sv
iverilog -g2001 -o sim.out tb_jls_encoder.v ../RTL/jls_encoder.v
vvp -n sim.out
del sim.out
pause