From 47df5b621915255626141ee21272aa2e189857cf Mon Sep 17 00:00:00 2001 From: WangXuan95 <629708558@qq.com> Date: Sat, 3 Jun 2023 21:15:55 +0800 Subject: [PATCH] change to Verilog2001 --- .gitignore | 1 + README.md | 341 ++++++++-------- RTL/{jls_encoder.sv => jls_encoder.v} | 408 +++++++++++--------- SIM/{tb_jls_encoder.sv => tb_jls_encoder.v} | 72 ++-- SIM/tb_jls_encoder_run_iverilog.bat | 2 +- 5 files changed, 459 insertions(+), 365 deletions(-) rename RTL/{jls_encoder.sv => jls_encoder.v} (74%) rename SIM/{tb_jls_encoder.sv => tb_jls_encoder.v} (82%) diff --git a/.gitignore b/.gitignore index 2416432..a86bfc2 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,3 @@ **/vivado **/quartus +FPGA_jls_encoder_test diff --git a/README.md b/README.md index 17888c3..97e1f85 100644 --- a/README.md +++ b/README.md @@ -1,173 +1,33 @@ -![语言](https://img.shields.io/badge/语言-systemverilog_(IEEE1800_2005)-CAD09D.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg) - -中文 | [English](#en) - -FPGA JPEG-LS image compressor -=========================== - -基于 **FPGA** 的流式的 **JPEG-LS** 图象压缩器,特点是: - -* 用于压缩 **8bit** 的灰度图像。 -* 可选**无损模式**,即 NEAR=0 。 -* 可选**有损模式**,NEAR=1~7 可调。 -* 图像宽度取值范围为 [5,16384],高度取值范围为 [1,16384]。 -* 极简流式输入输出。 - - - -# 背景知识 - -**JPEG-LS** (简称**JLS**)是一种无损/有损的图像压缩算法,其无损模式的压缩率相当优异,优于 Lossless-JPEG、Lossless-JPEG2000、Lossless-JPEG-XR、FELICES 等。**JPEG-LS** 用压缩前后的像素的最大差值(**NEAR**值)来控制失真,无损模式下 **NEAR=0**;有损模式下**NEAR>0**,**NEAR** 越大,失真越大,压缩率也越大。**JPEG-LS** 压缩图像的文件后缀是 .**jls** 。 - - - -# 使用方法 - -RTL 目录中的 [**jls_encoder.sv**](./RTL/jls_encoder.sv) 是用户可以调用的 JPEG-LS 压缩模块,它输入图像原始像素,输出 JPEG-LS 压缩流。 - -## 模块参数 - -**jls_encoder** 只有一个参数: - -```verilog -parameter logic [2:0] NEAR -``` - -决定了 **NEAR** 值,取值为 3'd0 时,工作在无损模式;取值为 3'd1~3'd7 时,工作在有损模式。 - -## 模块信号 - -**jls_encoder** 的输入输出信号描述如下表。 - -| 信号名称 | 全称 | 方向 | 宽度 | 描述 | -| :---: | :---: | :---: | :---: | :--- | -| 
rstn | 同步复位 | input | 1bit | 当时钟上升沿时若 rstn=0,模块复位,正常使用时 rstn=1 | -| clk | 时钟 | input | 1bit | 时钟,所有信号都应该于 clk 上升沿对齐。 | -| i_sof | 图像开始 | input | 1bit | 当需要输入一个新的图像时,保持至少368个时钟周期的 i_sof=1 | -| i_w | 图像宽度-1 | input | 14bit | 例如图像宽度为 1920,则 i_w 应该置为 14‘d1919。需要在 i_sof=1 时保持有效。 | -| i_h | 图像高度-1 | input | 14bit | 例如图像宽度为 1080,则 i_h 应该置为 14‘d1079。需要在 i_sof=1 时保持有效。 | -| i_e | 输入像素有效 | input | 1bit | 当 i_e=1 时,一个像素需要被输入到 i_x 上。 | -| i_x | 输入像素 | input | 8bit | 像素取值范围为 8'd0 ~ 8'd255 。 | -| o_e | 输出有效 | output | 1bit | 当 o_e=1 时,输出流数据产生在 o_data 上。 | -| o_data | 输出流数据 | output | 16bit | 大端序,o_data[15:8] 在先;o_data[7:0] 在后。 | -| o_last | 输出流末尾 | output | 1bit | 当 o_e=1 时若 o_last=1 ,说明这是一张图象的输出流的最后一个数据。 | - -> 注:i_w 不能小于 14'd4 。 - -## 输入图片 - -**jls_encoder 模块**的操作的流程是: - -1. **复位**(可选):令 rstn=0 至少 **1 个周期**进行复位,之后正常工作时都保持 rstn=1。实际上也可以不复位(即让 rstn 恒为1)。 -2. **开始**:保持 i_sof=1 **至少 368 个周期**,同时在 i_w 和 i_h 信号上输入图像的宽度和高度,i_sof=1 期间 i_w 和 i_h 要一直保持有效。 -3. **输入**:控制 i_e 和 i_x,从左到右,从上到下地输入该图像的所有像素。当 i_e=1 时,i_x 作为一个像素被输入。 -4. 
**图像间空闲**:所有像素输入结束后,需要空闲**至少 16 个周期**不做任何动作(即 i_sof=0,i_e=0)。然后才能跳到第2步,开始下一个图像。 - -i_sof=1 和 i_e=1 之间;以及 i_e=1 各自之间可以插入任意个空闲气泡(即, i_sof=0,i_e=0),这意味着我们可以断断续续地输入像素(当然,不插入任何气泡才能达到最高性能)。 - -下图展示了压缩 2 张图像的输入时序图(//代表省略若干周期,X代表don't care)。其中图像 1 在输入第一个像素后插入了 1 个气泡;而图像 2 在 i_sof=1 后插入了 1 个气泡。注意**图像间空闲**必须至少 **16 个周期**。 - - __ __// __ __ __ __ //_ __ // __ __// __ __ __ // __ - clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_ - _______//________ // // _______//________ // - i_sof ____/ // \________________//___________//____/ // \___________//________ - _______//________ // // _______//________ // - i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX - _______//________ // // _______//________ // - i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX - // _____ ____//_____ // // _____//____ - i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___ - // _____ ____//_____ // // _____//____ - i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX - - 阶段: | 开始图像1 | 输入图像1 | 图像间空闲 | 开始图像2 | 输入图像2 - -## 输出压缩流 - -在输入过程中,**jls_encoder** 同时会输出压缩好的 **JPEG-LS流**,该流构成了完整的 .jls 文件的内容(包括文件头部和尾部)。o_e=1 时,o_data 是一个有效输出数据。其中,o_data 遵循大端序,即 o_data[15:8] 在流中的位置靠前,o_data[7:0] 在流中的位置靠后。在每个图像的输出流遇到最后一个数据时,o_last=1 指示一张图像的压缩流结束。 - - - -# 仿真 - -仿真相关文件都在 SIM 目录里,包括: - -* tb_jls_encoder.sv 是针对 jls_encoder 的 testbench。行为是:将指定文件夹里的 .pgm 格式的未压缩图像批量送入 jls_encoder 进行压缩,然后将 jls_encoder 的输出结果保存到 .jls 文件里。 -* tb_jls_encoder_run_iverilog.bat 包含了执行 iverilog 仿真的命令。 -* images 文件夹包含几张 .pgm 格式的图像文件。 .pgm 格式存储的是未压缩(也就是存储原始像素)的 8bit 灰度图像,可以使用 photoshop 软件或 Linux 图像查看器就能打开它(Windows图像查看器查看不了它)。 - -> .pgm 文件格式非常简单,只有一个文件头来指示图像的长宽,然后紧接着就存放图像的所有原始像素。因此我选用 .pgm 文件作为仿真的输入文件,因为只需要在 testbench 中简单地编写一些代码就能解析 .pgm 文件,并把其中的像素取出发给 jls_encoder 。不过,你可以不关注 pgm 文件的格式,因为 jls_encoder 的工作与 pgm 
格式并没有关系,它只需要接受图像的原始像素作为输入即可。你只需关注仿真的波形,关注图像像素是如何被送入 jls_encoder 中即可。 - -使用 iverilog 进行仿真前,需要安装 iverilog ,见:[iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md) - -然后双击 tb_jls_encoder_run_iverilog.bat 就可以运行仿真,该仿真需要运行十几分钟。 - -仿真结束后,你可以看到文件夹中产生了几个 .jls 文件,它们就是压缩得到的图像文件。另外,仿真还产生了波形文件 dump.vcd ,你可以用 gtkwave 打开 dump.vcd 来查看波形。 - -另外,你还可以修改一些仿真参数来进行: - -- 修改 tb_jls_encoder.sv 里的宏名 **NEAR** 来改变压缩率。 -- 修改 tb_jls_encoder.sv 里的宏名 **BUBBLE_CONTROL** 来决定输入相邻的像素间插入多少个气泡: - - **BUBBLE_CONTROL=0** 时,不插入任何气泡。 - - **BUBBLE_CONTROL>0** 时,插入 **BUBBLE_CONTROL **个气泡。 - - **BUBBLE_CONTROL<0** 时,每次插入随机的 **0~(-BUBBLE_CONTROL)** 个气泡 - -> 在不同 NEAR 值和 BUBBLE_CONTROL 值下,本库已经经过了几百张照片的结果对比验证,充分保证无bug。(这部分自动化验证代码就没放上来了) - -## 查看压缩结果 - -因为 **JPEG-LS** 比较小众和专业,大多数图片查看软件无法查看 .jls 文件。 - -你可以试试用[该网站](https://filext.com/file-extension/JLS)来查看 .jls 文件(不过这个网站时常失效)。 - -如果该网站失效,可以用我提供的解压器 decoder.exe 来把它解压回 .pgm 文件再查看。请在 SIM 目录下用 CMD 运行命令: - -```powershell -.\decoder.exe -``` - -例如: - -```powershell -.\decoder.exe test000.jls tmp.pgm -``` - -> 注:decoder.exe 编译自 UBC 提供的 C 语言源码: http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm - - - - - -# FPGA 部署 - -在 Xilinx Artix-7 xc7a35tcsg324-2 上,综合和实现的结果如下。 - -| LUT | FF | BRAM | 最高时钟频率 | -| :--------: | :------: | :----------------------------: | :----------: | -| 2347 (11%) | 932 (2%) | 9个RAMB18 (9%),等效于 144Kbit | 35 MHz | - -35MHz 下,图像压缩的性能为 35 Mpixel/s ,对 1920x1080 图像的压缩帧率是 16.8fps 。 +![语言](https://img.shields.io/badge/语言-verilog_(IEEE1364_2001)-9A90FD.svg) ![仿真](https://img.shields.io/badge/仿真-iverilog-green.svg) ![部署](https://img.shields.io/badge/部署-quartus-blue.svg) ![部署](https://img.shields.io/badge/部署-vivado-FF1010.svg) +[English](#en) | [中文](#cn) +  FPGA JPEG-LS image compressor =========================== **FPGA** based streaming **JPEG-LS** image compressor, features: +* Pure Verilog design, compatible with various FPGA platforms. * For compressing **8bit** grayscale images. 
* Support **lossless mode**, i.e. NEAR=0 . * Support **lossy mode**, NEAR=1~7 adjustable. * The value range of image width is [5,16384], and the value range of height is [1,16384]. -* Minimalist streaming input and output. - +* Simple streaming input and output. +  # Background -**JPEG-LS** (abbreviated as **JLS**) is a lossless/lossy image compression algorithm which has the best lossless compression ratio compared to JPEG2000 and JPEG-XR. **JPEG-LS** uses the maximum difference between the pixels before and after compression (**NEAR** value) to control distortion, **NEAR=0** is the lossless mode; **NEAR>0** is the lossy mode, the larger the **NEAR**, the greater the distortion and the greater the compression ratio. The file suffix name for **JPEG-LS** compressed image is .**jls** . +**JPEG-LS** (**JLS**) is a lossless/lossy image compression algorithm which has the best lossless compression ratio compared to PNG, Lossless-JPEG2000, Lossless-WEBP, Lossless-HEIF, etc. **JPEG-LS** uses the maximum difference between the pixels before and after compression (**NEAR** value) to control distortion, **NEAR=0** is the lossless mode; **NEAR>0** is the lossy mode, the larger the **NEAR**, the greater the distortion and the greater the compression ratio. The file suffix name for **JPEG-LS** compressed image is .**jls** . +JPEG-LS has two generations: +- JPEG-LS baseline (ITU-T T.87): JPEG-LS refers to the JPEG-LS baseline by default. **This repo implements the encoder of JPEG-LS baseline**. If you are interested in the software version of JPEG-LS baseline encoder, see https://github.com/WangXuan95/JPEG-LS (C language) +- JPEG-LS extension (ITU-T T.870): Its compression ratio is higher than JPEG-LS baseline, but it is very rarely (even no code can be found online). **This repo is not about JPEG-LS extension**. 
However, I have a C implemented of JPEG-LS extension, see https://github.com/WangXuan95/JPEG-LS_extension + +  # Module Usage @@ -234,7 +94,7 @@ The following figure shows the input timing diagram of compressing 2 images (//r During the input, **jls_encoder** will also output a compressed **JPEG-LS stream**, which constitutes the content of the complete .jls file (including the file header and trailer). When `o_e=1`, `o_data` is a valid output data. Among them, `o_data` follows the big endian order, that is, `o_data[15:8]` is at the front of the stream, and `o_data[7:0]` is at the back of the stream. `o_last=1` indicates the end of the compressed stream for an image when the output stream for each image encounters the last data. - +  # RTL Simulation @@ -260,7 +120,7 @@ In addition, you can also modify some simulation parameters: - When **BUBBLE_CONTROL>0**, insert **BUBBLE_CONTROL ** bubbles. - When **BUBBLE_CONTROL<0**, insert random **0~(-BUBBLE_CONTROL)** bubbles each time. - +  ## View compressed JLS file @@ -282,7 +142,7 @@ For example: > Note: decoder.exe is compiled from the C language source code provided by UBC : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm - +  # FPGA Deployment @@ -294,3 +154,174 @@ On Xilinx Artix-7 xc7a35tcsg324-2, the synthesized and implemented results are a At 35MHz, the image compression performance is 35 Mpixel/s, which means the compression frame rate for 1920x1080 images is 16.8fps. 
+  + +# Reference + +- ITU-T T.87 : Information technology – Lossless and near-lossless compression of continuous-tone still images – Baseline : https://www.itu.int/rec/T-REC-T.87/en +- UBC's JPEG-LS baseline Public Domain Code : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm +- Simple JPEG-LS baseline encoder in C language : https://github.com/WangXuan95/JPEG-LS + +  + +  + +  + +FPGA JPEG-LS image compressor +=========================== + +基于 **FPGA** 的流式的 **JPEG-LS** 图像压缩器,特点是: + +* 纯 Verilog 设计,可在各种FPGA型号上部署 +* 用于压缩 **8bit** 的灰度图像。 +* 可选**无损模式**,即 NEAR=0 。 +* 可选**有损模式**,NEAR=1~7 可调。 +* 图像宽度取值范围为 [5,16384],高度取值范围为 [1,16384]。 +* 极简流式输入输出。 + +  + +# 背景知识 + +**JPEG-LS** (简称**JLS**)是一种无损/有损的图像压缩算法,其无损模式的压缩率相当优异,优于 PNG、Lossless-JPEG2000、Lossless-WEBP、Lossless-HEIF 等。**JPEG-LS** 用压缩前后的像素的最大差值(**NEAR**值)来控制失真,无损模式下 **NEAR=0**;有损模式下**NEAR>0**,**NEAR** 越大,失真越大,压缩率也越大。**JPEG-LS** 压缩图像的文件后缀是 .**jls** 。 + +JPEG-LS 有两代: + +- JPEG-LS baseline (ITU-T T.87) : 一般提到 JPEG-LS 默认都是指 JPEG-LS baseline。**本库也实现的是 JPEG-LS baseline 的 encoder** 。如果你对软件版本的 JPEG-LS baseline encoder 感兴趣,可以看 https://github.com/WangXuan95/JPEG-LS (C语言实现) +- JPEG-LS extension (ITU-T T.870) : 其压缩率高于 JPEG-LS baseline ,但使用的非常少 (在网上搜不到任何代码) 。**本库与 JPEG-LS extension 无关!**不过我依照 ITU-T T.870 实现了 C 语言的 JPEG-LS extension,见 https://github.com/WangXuan95/JPEG-LS_extension + +  + +# 使用方法 + +RTL 目录中的 [**jls_encoder.sv**](./RTL/jls_encoder.sv) 是用户可以调用的 JPEG-LS 压缩模块,它输入图像原始像素,输出 JPEG-LS 压缩流。 + +## 模块参数 + +**jls_encoder** 只有一个参数: + +```verilog +parameter logic [2:0] NEAR +``` + +决定了 **NEAR** 值,取值为 3'd0 时,工作在无损模式;取值为 3'd1~3'd7 时,工作在有损模式。 + +## 模块信号 + +**jls_encoder** 的输入输出信号描述如下表。 + +| 信号名称 | 全称 | 方向 | 宽度 | 描述 | +| :---: | :---: | :---: | :---: | :--- | +| rstn | 同步复位 | input | 1bit | 当时钟上升沿时若 rstn=0,模块复位,正常使用时 rstn=1 | +| clk | 时钟 | input | 1bit | 时钟,所有信号都应该于 clk 上升沿对齐。 | +| i_sof | 图像开始 | input | 1bit | 当需要输入一个新的图像时,保持至少368个时钟周期的 i_sof=1 | +| i_w | 图像宽度-1 | input | 14bit | 例如图像宽度为 1920,则 i_w 应该置为 14‘d1919。需要在 i_sof=1 
时保持有效。 | +| i_h | 图像高度-1 | input | 14bit | 例如图像宽度为 1080,则 i_h 应该置为 14‘d1079。需要在 i_sof=1 时保持有效。 | +| i_e | 输入像素有效 | input | 1bit | 当 i_e=1 时,一个像素需要被输入到 i_x 上。 | +| i_x | 输入像素 | input | 8bit | 像素取值范围为 8'd0 ~ 8'd255 。 | +| o_e | 输出有效 | output | 1bit | 当 o_e=1 时,输出流数据产生在 o_data 上。 | +| o_data | 输出流数据 | output | 16bit | 大端序,o_data[15:8] 在先;o_data[7:0] 在后。 | +| o_last | 输出流末尾 | output | 1bit | 当 o_e=1 时若 o_last=1 ,说明这是一张图像的输出流的最后一个数据。 | + +> 注:i_w 不能小于 14'd4 。 + +## 输入图片 + +**jls_encoder 模块**的操作的流程是: + +1. **复位**(可选):令 rstn=0 至少 **1 个周期**进行复位,之后正常工作时都保持 rstn=1。实际上也可以不复位(即让 rstn 恒为1)。 +2. **开始**:保持 i_sof=1 **至少 368 个周期**,同时在 i_w 和 i_h 信号上输入图像的宽度和高度,i_sof=1 期间 i_w 和 i_h 要一直保持有效。 +3. **输入**:控制 i_e 和 i_x,从左到右,从上到下地输入该图像的所有像素。当 i_e=1 时,i_x 作为一个像素被输入。 +4. **图像间空闲**:所有像素输入结束后,需要空闲**至少 16 个周期**不做任何动作(即 i_sof=0,i_e=0)。然后才能跳到第2步,开始下一个图像。 + +i_sof=1 和 i_e=1 之间;以及 i_e=1 各自之间可以插入任意个空闲气泡(即, i_sof=0,i_e=0),这意味着我们可以断断续续地输入像素(当然,不插入任何气泡才能达到最高性能)。 + +下图展示了压缩 2 张图像的输入时序图(//代表省略若干周期,X代表don't care)。其中图像 1 在输入第一个像素后插入了 1 个气泡;而图像 2 在 i_sof=1 后插入了 1 个气泡。注意**图像间空闲**必须至少 **16 个周期**。 + + __ __// __ __ __ __ //_ __ // __ __// __ __ __ // __ + clk \__/ \__/ //_/ \__/ \__/ \__/ \__// \__/ \__///\__/ \__/ //_/ \__/ \__/ \__///\__/ \_ + _______//________ // // _______//________ // + i_sof ____/ // \________________//___________//____/ // \___________//________ + _______//________ // // _______//________ // + i_w XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX + _______//________ // // _______//________ // + i_h XXXXX_______//________XXXXXXXXXXXXXXXXX//XXXXXXXXXXX//XXXXX_______//________XXXXXXXXXXXX//XXXXXXXX + // _____ ____//_____ // // _____//____ + i_e ____________//________/ \_____/ // \_____//____________//______________/ // \___ + // _____ ____//_____ // // _____//____ + i_x XXXXXXXXXXXX//XXXXXXXXX_____XXXXXXX____//_____XXXXXX//XXXXXXXXXXXX//XXXXXXXXXXXXXXX_____//____XXXX + + 阶段: | 开始图像1 | 输入图像1 | 图像间空闲 | 开始图像2 | 输入图像2 + +## 输出压缩流 + 
+在输入过程中,**jls_encoder** 同时会输出压缩好的 **JPEG-LS流**,该流构成了完整的 .jls 文件的内容(包括文件头部和尾部)。o_e=1 时,o_data 是一个有效输出数据。其中,o_data 遵循大端序,即 o_data[15:8] 在流中的位置靠前,o_data[7:0] 在流中的位置靠后。在每个图像的输出流遇到最后一个数据时,o_last=1 指示一张图像的压缩流结束。 + +  + +# 仿真 + +仿真相关文件都在 SIM 目录里,包括: + +* tb_jls_encoder.sv 是针对 jls_encoder 的 testbench。行为是:将指定文件夹里的 .pgm 格式的未压缩图像批量送入 jls_encoder 进行压缩,然后将 jls_encoder 的输出结果保存到 .jls 文件里。 +* tb_jls_encoder_run_iverilog.bat 包含了执行 iverilog 仿真的命令。 +* images 文件夹包含几张 .pgm 格式的图像文件。 .pgm 格式存储的是未压缩(也就是存储原始像素)的 8bit 灰度图像,可以使用 photoshop 软件或 Linux 图像查看器就能打开它(Windows图像查看器查看不了它)。 + +> .pgm 文件格式非常简单,只有一个文件头来指示图像的长宽,然后紧接着就存放图像的所有原始像素。因此我选用 .pgm 文件作为仿真的输入文件,因为只需要在 testbench 中简单地编写一些代码就能解析 .pgm 文件,并把其中的像素取出发给 jls_encoder 。不过,你可以不关注 pgm 文件的格式,因为 jls_encoder 的工作与 pgm 格式并没有关系,它只需要接受图像的原始像素作为输入即可。你只需关注仿真的波形,关注图像像素是如何被送入 jls_encoder 中即可。 + +使用 iverilog 进行仿真前,需要安装 iverilog ,见:[iverilog_usage](https://github.com/WangXuan95/WangXuan95/blob/main/iverilog_usage/iverilog_usage.md) + +然后双击 tb_jls_encoder_run_iverilog.bat 就可以运行仿真,该仿真需要运行十几分钟。 + +仿真结束后,你可以看到文件夹中产生了几个 .jls 文件,它们就是压缩得到的图像文件。另外,仿真还产生了波形文件 dump.vcd ,你可以用 gtkwave 打开 dump.vcd 来查看波形。 + +另外,你还可以修改一些仿真参数来进行: + +- 修改 tb_jls_encoder.sv 里的宏名 **NEAR** 来改变压缩率。 +- 修改 tb_jls_encoder.sv 里的宏名 **BUBBLE_CONTROL** 来决定输入相邻的像素间插入多少个气泡: + - **BUBBLE_CONTROL=0** 时,不插入任何气泡。 + - **BUBBLE_CONTROL>0** 时,插入 **BUBBLE_CONTROL **个气泡。 + - **BUBBLE_CONTROL<0** 时,每次插入随机的 **0~(-BUBBLE_CONTROL)** 个气泡 + +> 在不同 NEAR 值和 BUBBLE_CONTROL 值下,本库已经经过了几百张照片的结果对比验证,充分保证无bug。(这部分自动化验证代码就没放上来了) + +## 查看压缩结果 + +因为 **JPEG-LS** 比较小众和专业,大多数图片查看软件无法查看 .jls 文件。 + +你可以试试用[该网站](https://filext.com/file-extension/JLS)来查看 .jls 文件(不过这个网站时常失效)。 + +如果该网站失效,可以用我提供的解压器 decoder.exe 来把它解压回 .pgm 文件再查看。请在 SIM 目录下用 CMD 运行命令: + +```powershell +.\decoder.exe +``` + +例如: + +```powershell +.\decoder.exe test000.jls tmp.pgm +``` + +> 注:decoder.exe 编译自 UBC 提供的 C 语言源码: http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm + +  + +# FPGA 部署 + +在 Xilinx Artix-7 xc7a35tcsg324-2 上,综合和实现的结果如下。 + +| LUT | FF | BRAM | 最高时钟频率 
| +| :--------: | :------: | :----------------------------: | :----------: | +| 2347 (11%) | 932 (2%) | 9个RAMB18 (9%),等效于 144Kbit | 35 MHz | + +35MHz 下,图像压缩的性能为 35 Mpixel/s ,对 1920x1080 图像的压缩帧率是 16.8fps 。 + +  + +# 相关链接 + +- ITU-T T.87 : Information technology – Lossless and near-lossless compression of continuous-tone still images – Baseline : https://www.itu.int/rec/T-REC-T.87/en +- UBC's JPEG-LS baseline Public Domain Code : http://www.stat.columbia.edu/~jakulin/jpeg-ls/mirror.htm +- 精简的 JPEG-LS baseline 编码器 (C语言) : https://github.com/WangXuan95/JPEG-LS diff --git a/RTL/jls_encoder.sv b/RTL/jls_encoder.v similarity index 74% rename from RTL/jls_encoder.sv rename to RTL/jls_encoder.v index 574a08a..ee170f6 100644 --- a/RTL/jls_encoder.sv +++ b/RTL/jls_encoder.v @@ -2,12 +2,12 @@ //-------------------------------------------------------------------------------------------------------- // Module : jls_encoder // Type : synthesizable, IP's top -// Standard: SystemVerilog 2005 (IEEE1800-2005) +// Standard: Verilog 2001 (IEEE1364-2001) // Function: JPEG-LS image compressor //-------------------------------------------------------------------------------------------------------- module jls_encoder #( - parameter [2:0] NEAR = 3'd1 + parameter [ 2:0] NEAR = 3'd0 ) ( input wire rstn, input wire clk, @@ -21,10 +21,12 @@ module jls_encoder #( output wire o_last // indicate the last output data of a image ); + + //--------------------------------------------------------------------------------------------------------------------------- // local parameters //--------------------------------------------------------------------------------------------------------------------------- -wire [3:0] P_QBPPS [8]; +wire [3:0] P_QBPPS [0:7]; assign P_QBPPS[0] = 4'd8; assign P_QBPPS[1] = 4'd7; assign P_QBPPS[2] = 4'd6; @@ -34,19 +36,19 @@ assign P_QBPPS[5] = 4'd5; assign P_QBPPS[6] = 4'd5; assign P_QBPPS[7] = 4'd5; -localparam logic P_LOSSY = NEAR != '0; -localparam logic signed [8:0] 
P_NEAR = $signed({6'd0, NEAR}); -localparam logic signed [8:0] P_T1 = $signed(9'd3) + $signed(9'd3) * P_NEAR; -localparam logic signed [8:0] P_T2 = $signed(9'd7) + $signed(9'd5) * P_NEAR; -localparam logic signed [8:0] P_T3 = $signed(9'd21)+ $signed(9'd7) * P_NEAR; -localparam logic signed [9:0] P_QUANT = {P_NEAR, 1'b1}; -localparam logic signed [9:0] P_QBETA = $signed(10'd256 + {5'd0,NEAR,2'd0}) / P_QUANT; -localparam logic signed [9:0] P_QBETAHALF = (P_QBETA+$signed(10'd1)) / $signed(10'd2); -wire [3:0] P_QBPP = P_QBPPS[NEAR]; -wire [4:0] P_LIMIT = 5'd31 - {1'b0, P_QBPP}; -localparam logic [12:0] P_AINIT = (NEAR=='0) ? 13'd4 : 13'd2; +localparam P_LOSSY = (NEAR != 3'd0); +localparam signed [8:0] P_NEAR = $signed({6'd0, NEAR}); +localparam signed [8:0] P_T1 = $signed(9'd3) + $signed(9'd3) * P_NEAR; +localparam signed [8:0] P_T2 = $signed(9'd7) + $signed(9'd5) * P_NEAR; +localparam signed [8:0] P_T3 = $signed(9'd21)+ $signed(9'd7) * P_NEAR; +localparam signed [9:0] P_QUANT = {P_NEAR, 1'b1}; +localparam signed [9:0] P_QBETA = $signed(10'd256 + {5'd0,NEAR,2'd0}) / P_QUANT; +localparam signed [9:0] P_QBETAHALF = (P_QBETA+$signed(10'd1)) / $signed(10'd2); +wire [3:0] P_QBPP = P_QBPPS[NEAR]; +wire [4:0] P_LIMIT = 5'd31 - {1'b0, P_QBPP}; +localparam [12:0] P_AINIT = (NEAR == 3'd0) ? 
13'd4 : 13'd2; -wire [3:0] J [32]; +wire [3:0] J [0:31]; assign J[ 0] = 4'd0; assign J[ 1] = 4'd0; assign J[ 2] = 4'd0; @@ -85,188 +87,232 @@ assign J[31] = 4'd15; //--------------------------------------------------------------------------------------------------------------------------- // function: is_near //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic func_is_near(input [7:0] x1, input [7:0] x2); - logic signed [8:0] ex1, ex2; +function [0:0] func_is_near; + input [7:0] x1, x2; + reg signed [8:0] ex1, ex2; +begin ex1 = $signed({1'b0,x1}); ex2 = $signed({1'b0,x2}); - return ex1 - ex2 <= P_NEAR && ex2 - ex1 <= P_NEAR; + func_is_near = ((ex1 - ex2 <= P_NEAR) && (ex2 - ex1 <= P_NEAR)); +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: predictor (get_px) //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic [7:0] func_predictor(input [7:0] a, input [7:0] b, input [7:0] c); +function [7:0] func_predictor; + input [7:0] a, b, c; +begin if( c>=a && c>=b ) - return a>b ? b : a; + func_predictor = (a>b) ? b : a; else if( c<=a && c<=b ) - return a>b ? a : b; + func_predictor = (a>b) ? 
a : b; else - return a - c + b; + func_predictor = a - c + b; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: q_quantize //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic signed [3:0] func_q_quantize(input [7:0] x1, input [7:0] x2); - logic signed [8:0] delta; +function signed [3:0] func_q_quantize; + input [7:0] x1, x2; + reg signed [8:0] delta; +begin delta = $signed({1'b0,x1}) - $signed({1'b0,x2}); if (delta <= -P_T3 ) - return -$signed(4'd4); + func_q_quantize = -$signed(4'd4); else if(delta <= -P_T2 ) - return -$signed(4'd3); + func_q_quantize = -$signed(4'd3); else if(delta <= -P_T1 ) - return -$signed(4'd2); + func_q_quantize = -$signed(4'd2); else if(delta < -P_NEAR ) - return -$signed(4'd1); + func_q_quantize = -$signed(4'd1); else if(delta <= P_NEAR ) - return $signed(4'd0); + func_q_quantize = $signed(4'd0); else if(delta < P_T1 ) - return $signed(4'd1); + func_q_quantize = $signed(4'd1); else if(delta < P_T2 ) - return $signed(4'd2); + func_q_quantize = $signed(4'd2); else if(delta < P_T3 ) - return $signed(4'd3); + func_q_quantize = $signed(4'd3); else - return $signed(4'd4); + func_q_quantize = $signed(4'd4); +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: get_q (part 1), qp1 = 81*Q(d-b) + 9*Q(b-c) //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic signed [9:0] func_get_qp1(input [7:0] c, input [7:0] b, input [7:0] d); - return $signed(10'd81) * func_q_quantize(d,b) + $signed(10'd9) * func_q_quantize(b,c); +function signed [9:0] func_get_qp1; + input [7:0] c, b, d; +begin + func_get_qp1 = $signed(10'd81) * func_q_quantize(d,b) + 
$signed(10'd9) * func_q_quantize(b,c); +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: get_q (part 2), get sign(qs) and abs(qs), where qs = qp1 + Q(c-a) //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic [9:0] func_get_q(input signed [9:0] qp1, input [7:0] c, input [7:0] a); - logic signed [9:0] qs; - logic s; - logic [8:0] q; +function [9:0] func_get_q; + input signed [9:0] qp1; + input [7:0] c, a; + reg signed [9:0] qs; + reg s; + reg [8:0] q; +begin qs = qp1 + func_q_quantize(c,a); s = qs[9]; q = s ? (~qs[8:0]+9'd1) : qs[8:0]; - return {s, q}; + func_get_q = {s, q}; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: clip //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic [7:0] func_clip(input signed [9:0] val); +function [7:0] func_clip; + input signed [9:0] val; +begin if( val > $signed(10'd255) ) - return 8'd255; + func_clip = 8'd255; else if( val < $signed(10'd0) ) - return 8'd0; + func_clip = 8'd0; else - return val[7:0]; + func_clip = val[7:0]; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: errval_quantize //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic signed [9:0] func_errval_quantize(input signed [9:0] err); +function signed [9:0] func_errval_quantize; + input signed [9:0] err; +begin if(err[9]) - return -( (P_NEAR - err) / P_QUANT ); + func_errval_quantize = -( (P_NEAR - err) / P_QUANT ); else - return (P_NEAR + err) / P_QUANT; + 
func_errval_quantize = (P_NEAR + err) / P_QUANT; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: modrange //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic signed [9:0] func_modrange(input signed [9:0] val); - logic signed [9:0] new_val; - new_val = val; - if( new_val[9] ) - new_val += P_QBETA; - if( new_val >= P_QBETAHALF ) - new_val -= P_QBETA; - return new_val; +function signed [9:0] func_modrange; + input signed [9:0] val; +begin + func_modrange = val; + if( func_modrange[9] ) + func_modrange = func_modrange + P_QBETA; + if( func_modrange >= P_QBETAHALF ) + func_modrange = func_modrange - P_QBETA; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: get k //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic [3:0] func_get_k(input [12:0] A, input [6:0] N, input rt); - logic [18:0] Nt, At; - logic [ 3:0] k; +function [ 3:0] func_get_k; + input [12:0] A; + input [ 6:0] N; + input rt; + reg [18:0] Nt, At; + reg [ 3:0] ii; +begin Nt = {12'h0, N}; At = { 6'h0, A}; - k = 4'd0; - if(rt) - At += {13'd0, N[6:1]}; - for(int ii=0; ii<13; ii++) + func_get_k = 4'd0; + if (rt) + At = At + {13'd0, N[6:1]}; + for (ii=4'd0; ii<4'd13; ii=ii+4'd1) if((Nt<>>= 1; + if (errm0) + B_update = B_update + 7'd1; + if (reset) + B_update = B_update >>> 1; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: C, B update for regular mode //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic 
[14:0] C_B_update(input reset, input [6:0] N, input signed [7:0] C, input signed [6:0] B, input signed [9:0] err); - logic signed [9:0] Bt; - logic signed [7:0] Ct; +function [14:0] C_B_update; + input reset; + input [6:0] N; + input signed [7:0] C; + input signed [6:0] B; + input signed [9:0] err; + reg signed [9:0] Bt; + reg signed [7:0] Ct; +begin Bt = B; Ct = C; - Bt += err * P_QUANT; + Bt = Bt + (err * P_QUANT); if(reset) - Bt >>>= 1; + Bt = Bt >>> 1; if( Bt <= -$signed({3'd0,N}) ) begin - Bt += $signed({3'd0,N}); + Bt = Bt + $signed({3'd0,N}); if( Bt <= -$signed({3'd0,N}) ) Bt = -$signed({3'd0,N}-10'd1); if( Ct != $signed(8'd128) ) - Ct--; + Ct = Ct - 8'd1; end else if( Bt > $signed(10'd0) ) begin - Bt -= $signed({3'd0,N}); + Bt = Bt - $signed({3'd0,N}); if( Bt > $signed(10'd0) ) Bt = $signed(10'd0); if( Ct != $signed(8'd127) ) - Ct++; + Ct = Ct + 8'd1; end - return {Ct, Bt[6:0]}; + C_B_update = {Ct, Bt[6:0]}; +end endfunction //--------------------------------------------------------------------------------------------------------------------------- // function: A update //--------------------------------------------------------------------------------------------------------------------------- -function automatic logic [12:0] A_update(input reset, input [12:0] A, input [9:0] inc); +function [12:0] A_update; + input reset; + input [12:0] A; + input [ 9:0] inc; +begin A_update = A + {3'd0, inc}; if(reset) - A_update >>>= 1; + A_update = A_update >>> 1; +end endfunction //------------------------------------------------------------------------------------------------------------------- // context memorys //------------------------------------------------------------------------------------------------------------------- -reg [ 5:0] Nram [366]; -reg [12:0] Aram [366]; -reg signed [ 6:0] Bram [366]; +reg [ 5:0] Nram [0:365]; +reg [12:0] Aram [0:365]; +reg signed [ 6:0] Bram [0:365]; reg signed [ 7:0] Cram [1:364]; @@ -285,7 +331,7 @@ reg [14:0] a_jj; always @ 
(posedge clk)
     if(~rstn) begin
-        {a_sof, a_e, a_x, a_w, a_h, a_wl, a_hl, a_ii, a_jj} <= '0;
+        {a_sof, a_e, a_x, a_w, a_h, a_wl, a_hl, a_ii, a_jj} <= 0;
     end else begin
         a_sof <= i_sof;
         a_e <= i_e;
@@ -295,13 +341,13 @@ always @ (posedge clk)
         if(a_sof) begin
             a_wl <= (a_w<14'd4 ? 14'd4 : a_w);
             a_hl <= a_h;
-            a_ii <= '0;
-            a_jj <= '0;
+            a_ii <= 14'd0;
+            a_jj <= 15'd0;
         end else if(a_e) begin
             if(a_ii < a_wl)
                 a_ii <= a_ii + 14'd1;
             else begin
-                a_ii <= '0;
+                a_ii <= 14'd0;
                 if(a_jj <= {1'b0,a_hl})
                     a_jj <= a_jj + 15'd1;
             end
@@ -324,12 +370,12 @@ reg [ 7:0] b_x;
 always @ (posedge clk) begin
     b_sof <= a_sof & rstn;
     if(~rstn | a_sof) begin
-        {b_e, b_fc, b_lc, b_fr, b_eof, b_ii, b_x} <= '0;
+        {b_e, b_fc, b_lc, b_fr, b_eof, b_ii, b_x} <= 0;
     end else begin
         b_e <= a_e & (a_jj <= {1'b0,a_hl});
-        b_fc <= a_e & (a_ii == '0);
+        b_fc <= a_e & (a_ii == 14'd0);
         b_lc <= a_e & (a_ii == a_wl);
-        b_fr <= a_e & (a_jj == '0);
+        b_fr <= a_e & (a_jj == 15'd0);
         b_eof <= a_jj > {1'b0,a_hl};
         b_ii <= a_ii;
         b_x <= a_x;
@@ -356,7 +402,7 @@ reg [ 7:0] c_d;
 always @ (posedge clk) begin
     c_sof <= b_sof & rstn;
     if(~rstn | b_sof) begin
-        {c_e,c_fc,c_lc,c_fr,c_eof,c_ii,c_x,c_b,c_bt,c_c} <= '0;
+        {c_e,c_fc,c_lc,c_fr,c_eof,c_ii,c_x,c_b,c_bt,c_c} <= 0;
     end else begin
         c_e <= b_e;
         c_fc <= b_fc;
@@ -366,10 +412,10 @@ always @ (posedge clk) begin
         c_ii <= b_ii;
         if(b_e) begin
             c_x <= b_x;
-            c_b <= b_fr ? '0 : c_d;
+            c_b <= b_fr ? 8'd0 : c_d;
             if(b_fr) begin
-                c_bt <= '0;
-                c_c <= '0;
+                c_bt <= 8'd0;
+                c_c <= 8'd0;
             end else if(b_fc) begin
                 c_bt <= c_d;
                 c_c <= c_bt;
@@ -394,12 +440,13 @@ reg [ 7:0] d_b;
 reg [ 7:0] d_c;
 reg signed [9:0] d_qp1;
+wire [7:0] c_w_d = c_fr ? 8'd0 : (c_lc ? 
c_b : c_d);
+
 always @ (posedge clk) begin
     d_sof <= c_sof & rstn;
     if(~rstn | c_sof) begin
-        {d_e, d_fc, d_lc, d_eof, d_ii, d_x, d_b, d_c, d_qp1} <= '0;
+        {d_e, d_fc, d_lc, d_eof, d_ii, d_x, d_b, d_c, d_qp1} <= 0;
     end else begin
-        logic [7:0] d;
         d_e <= c_e;
         d_fc <= c_fc;
         d_lc <= c_lc;
@@ -408,8 +455,7 @@ always @ (posedge clk) begin
         d_x <= c_x;
         d_b <= c_b;
         d_c <= c_c;
-        d = c_fr ? '0 : (c_lc ? c_b : c_d);
-        d_qp1 <= func_get_qp1(c_c, c_b, d);
+        d_qp1 <= func_get_qp1(c_c, c_b, c_w_d);
     end
 end
@@ -436,48 +482,49 @@ reg [5:0] e_Nn;
 reg signed [7:0] e_Cn;
 reg signed [6:0] e_Bn;
+wire [7:0] d_w_a = d_fc ? d_b : e_x;
+
+reg s; // not real register
+reg [8:0] q; // not real register
+reg rt; // not real register
+reg runi; // not real register
+reg rune; // not real register
+reg signed [7:0] Co; // not real register
+reg [6:0] No, Nn; // not real register
+reg signed [6:0] Bo; // not real register
+reg signed [9:0] px; // not real register
+reg signed [9:0] err; // not real register
+
 always @ (posedge clk) begin
     e_sof <= d_sof & rstn;
     e_2BleN <= 1'b0;
-    {e_write_C, e_Cn, e_write_en, e_Bn, e_Nn} <= '0;
+    {e_write_C, e_Cn, e_write_en, e_Bn, e_Nn} <= 0;
     if(~rstn | d_sof) begin
-        {e_e, e_fc, e_lc, e_eof, e_ii, e_runi, e_rune, e_x, e_q, e_rt, e_err, e_No} <= '0;
+        {e_e, e_fc, e_lc, e_eof, e_ii, e_runi, e_rune, e_x, e_q, e_rt, e_err, e_No} <= 0;
     end else begin
-        logic [7:0] a;
-        logic s;
-        logic [8:0] q;
-        logic rt;
-        logic runi;
-        logic rune;
-        logic signed [7:0] Co;
-        logic [6:0] No, Nn;
-        logic signed [6:0] Bo;
-        logic signed [9:0] px;
-        logic signed [9:0] err;
-        a = d_fc ? d_b : e_x;
         rt = 1'b0;
         rune = 1'b0;
-        No = '0;
-        err = '0;
-        {s, q} = func_get_q(d_qp1, d_c, a);
+        No = 0;
+        err = 0;
+        {s, q} = func_get_q(d_qp1, d_c, d_w_a);
         Co = (e_write_C & e_q==q) ? e_Cn : Cram[q];
         runi = ~d_fc & e_runi | (q == 9'd0);
         if(runi) begin
-            runi = func_is_near(d_x, a);
+            runi = func_is_near(d_x, d_w_a);
             rune = ~runi;
         end
         if(d_e) begin
             if(runi) begin
-                e_x <= P_LOSSY ? 
a : d_x;
+                e_x <= P_LOSSY ? d_w_a : d_x;
             end else begin
                 if(rune) begin
-                    rt = func_is_near(d_b, a);
-                    s = {1'b0,a} > ({1'b0,d_b} + {6'd0,NEAR}) ? 1'b1 : 1'b0;
+                    rt = func_is_near(d_b, d_w_a);
+                    s = {1'b0,d_w_a} > ({1'b0,d_b} + {6'd0,NEAR}) ? 1'b1 : 1'b0;
                     q = rt ? 9'd365 : 9'd0;
-                    px = rt ? a : d_b;
+                    px = rt ? d_w_a : d_b;
                 end else begin
                     px[9:8] = 2'b00;
-                    px[7:0] = func_clip( $signed({2'h0, func_predictor(a,d_b,d_c)}) + ( s ? -$signed({Co[7],Co[7],Co}) : $signed({Co[7],Co[7],Co}) ) );
+                    px[7:0] = func_clip( $signed({2'h0, func_predictor(d_w_a, d_b, d_c)}) + ( s ? -$signed({Co[7],Co[7],Co}) : $signed({Co[7],Co[7],Co}) ) );
                 end
                 err = s ? px - $signed({2'd0, d_x}) : $signed({2'd0, d_x}) - px;
                 err = func_errval_quantize(err);
@@ -485,7 +532,7 @@ always @ (posedge clk) begin
                 err = func_modrange(err);
                 No = ((e_write_en & e_q==q) ? e_Nn : Nram[q]) + 7'd1;
                 Nn = No;
-                if(No[6]) Nn >>>= 1;
+                if(No[6]) Nn = Nn >>> 1;
                 e_Nn <= Nn[5:0];
                 Bo = (e_write_en & e_q==q) ? e_Bn : Bram[q];
                 e_write_en <= 1'b1;
@@ -518,6 +565,7 @@ end
 // pipeline stage f: write Cram, Bram, Nram
 //-------------------------------------------------------------------------------------------------------------------
 reg [8:0] NBC_init_addr;
+
 always @ (posedge clk)
     NBC_init_addr <= e_sof ? NBC_init_addr + (NBC_init_addr < 9'd366 ?
9'd1 : 9'd0) : 9'd0;
@@ -556,7 +604,7 @@ reg ef_write_en;
 always @ (posedge clk) begin
     ef_sof <= e_sof & rstn;
     if(~rstn | e_sof) begin
-        {ef_e, ef_fc, ef_lc, ef_eof, ef_runi, ef_rune, ef_2BleN, ef_q, ef_rt, ef_err, ef_No, ef_write_en} <= '0;
+        {ef_e, ef_fc, ef_lc, ef_eof, ef_runi, ef_rune, ef_2BleN, ef_q, ef_rt, ef_err, ef_No, ef_write_en} <= 0;
     end else begin
         ef_e <= e_e;
         ef_fc <= e_fc;
@@ -603,50 +651,50 @@ reg g_write_en;
 reg [12:0] g_An;
 always @ (posedge clk)
     if(~rstn | f_sof)
-        {g_q, g_write_en, g_An} <= '0;
+        {g_q, g_write_en, g_An} <= 0;
     else
         {g_q, g_write_en, g_An} <= {f_q, f_write_en, f_An};
+reg [ 1:0] on; // not real register
+reg [15:0] rc; // not real register
+reg [ 4:0] ri; // not real register
+reg [12:0] Ao; // not real register
+reg [ 3:0] k; // not real register
+reg [ 9:0] abserr; // not real register
+reg [ 9:0] merr, Ainc; // not real register
+reg map; // not real register
 always @ (posedge clk) begin
     f_sof <= ef_sof & rstn;
     f_limit <= P_LIMIT;
     f_An <= P_AINIT;
     if(~rstn | ef_sof) begin
-        {f_e, f_eof, f_runi, f_rune, f_merr, f_k, f_rc, f_ri, f_on, f_cb, f_cn, f_q, f_write_en} <= '0;
+        {f_e, f_eof, f_runi, f_rune, f_merr, f_k, f_rc, f_ri, f_on, f_cb, f_cn, f_q, f_write_en} <= 0;
     end else begin
-        logic [ 1:0] on;
-        logic [15:0] rc;
-        logic [ 4:0] ri;
-        logic [12:0] Ao;
-        logic [ 3:0] k;
-        logic [ 9:0] abserr;
-        logic [ 9:0] merr, Ainc;
-        logic map;
-        on = '0;
-        rc = (ef_fc|~ef_runi) ? '0 : f_rc;
+        on = 2'd0;
+        rc = (ef_fc|~ef_runi) ? 16'd0 : f_rc;
         ri = f_ri;
         Ao = (f_write_en & f_q==ef_q) ? f_An : (g_write_en & g_q==ef_q) ? g_An : ef_Ao;
         abserr = ef_err<$signed(10'd0) ? $unsigned(-ef_err) : $unsigned(ef_err);
-        merr='0;
-        Ainc='0;
+        merr = 10'd0;
+        Ainc = 10'd0;
         f_write_en <= ef_write_en;
         k = func_get_k(Ao, ef_No, ef_rt);
-        f_cb <= ef_fc ? '0 : f_rc;
+        f_cb <= ef_fc ?
16'd0 : f_rc;
         f_cn <= {1'b0,J[ri]} + 5'd1;
         if(ef_runi) begin
-            rc ++;
+            rc = rc + 16'd1;
             if(rc >= (16'd1< 16'd0))
-                on++;
+                on = on + 2'd1;
         end else if(ef_rune) begin
             f_limit <= P_LIMIT - 5'd1 - {1'b0,J[ri]};
-            if(ri > '0) ri --;
-            map = ~( (ef_err=='0) | ( (ef_err>$signed(10'd0)) ^ (k==4'd0 & ef_2BleN) ) );
+            if (ri > 5'd0) ri = ri - 5'd1;
+            map = ~( (ef_err==10'd0) | ( (ef_err>$signed(10'd0)) ^ (k==4'd0 & ef_2BleN) ) );
             merr = (abserr<<1) - {9'd0,ef_rt} - {9'd0,map};
             Ainc = ((merr + {9'd0,~ef_rt}) >> 1);
         end else begin
@@ -700,21 +748,21 @@ reg [ 4:0] g_zn; // in range of 0~27
 reg [ 9:0] g_db;
 reg [ 3:0] g_dn; // in range of 0~13
+wire [ 9:0] f_w_merr_sk = (f_merr >> f_k);
+
 always @ (posedge clk) begin
     g_sof <= f_sof & rstn;
     if(~rstn | f_sof) begin
-        {g_e, g_eof, g_runi, g_on, g_cb, g_cn, g_zn, g_db, g_dn} <= '0;
+        {g_e, g_eof, g_runi, g_on, g_cb, g_cn, g_zn, g_db, g_dn} <= 0;
     end else begin
-        logic [9:0] merr_sk;
-        merr_sk = f_merr >> f_k;
         g_e <= f_e;
         g_eof <= f_eof;
         g_runi <= f_runi;
         g_on <= f_on;
-        g_cb <= f_rune ? f_cb : '0;
-        g_cn <= f_rune ? f_cn : '0;
-        if(merr_sk < f_limit) begin
-            g_zn <= merr_sk[4:0];
+        g_cb <= f_rune ? f_cb : 16'd0;
+        g_cn <= f_rune ?
f_cn : 5'd0;
+        if(f_w_merr_sk < f_limit) begin
+            g_zn <= f_w_merr_sk[4:0];
             g_db <= f_merr & ~(10'h3ff<> j_bcnt);
         bcnt = j_bcnt + {2'd0,h_bn};
         if(bcnt >= 8'd16) begin
             j_e <= 1'b1;
             j_data[15:8] <= bbuf[247:240];
-            if(bbuf[247:240] == '1) begin
+            if(bbuf[247:240] == 8'hFF) begin
                 bbuf = {1'h0, bbuf[239:0], 7'h0};
-                bcnt -= 8'd7;
+                bcnt = bcnt - 8'd7;
             end else begin
                 bbuf = { bbuf[239:0], 8'h0};
-                bcnt -= 8'd8;
+                bcnt = bcnt - 8'd8;
             end
             j_data[ 7:0] <= bbuf[247:240];
-            if(bbuf[247:240] == '1) begin
+            if(bbuf[247:240] == 8'hFF) begin
                 bbuf = {1'h0, bbuf[239:0], 7'h0};
-                bcnt -= 8'd7;
+                bcnt = bcnt - 8'd7;
             end else begin
                 bbuf = { bbuf[239:0], 8'h0};
-                bcnt -= 8'd8;
+                bcnt = bcnt - 8'd8;
             end
         end else if(h_eof && bcnt > 8'd0) begin
             j_e <= 1'b1;
             j_data[15:8] <= bbuf[247:240];
-            if(bbuf[247:240] == '1)
+            if(bbuf[247:240] == 8'hFF)
                 j_data[ 7:0] <= {1'b0,bbuf[239:233]};
             else
                 j_data[ 7:0] <= bbuf[239:232];
-            bbuf = '0;
+            bbuf = 248'd0;
             bcnt = 8'd0;
         end
         j_bbuf <= bbuf;
@@ -816,7 +869,7 @@ end
 // make .jls file header and footer
 //-------------------------------------------------------------------------------------------------------------------
 reg [15:0] jls_wl, jls_hl;
-wire[15:0] jls_header [13];
+wire[15:0] jls_header [0:12];
 assign jls_header[0] = 16'hFFD8;
 assign jls_header[1] = 16'h00FF;
 assign jls_header[2] = 16'hF700;
@@ -831,10 +884,11 @@ assign jls_header[10]= 16'h0101;
 assign jls_header[11]= {13'b0,NEAR};
 assign jls_header[12]= 16'h0000;
 wire[15:0] jls_footer = 16'hFFD9;
+
 always @ (posedge clk)
     if(~rstn) begin
-        jls_wl <= '0;
-        jls_hl <= '0;
+        jls_wl <= 16'd0;
+        jls_hl <= 16'd0;
     end else begin
         jls_wl <= {2'd0,a_wl} + 16'd1;
         jls_hl <= {2'd0,a_hl} + 16'd1;
@@ -853,21 +907,21 @@ reg [15:0] k_data;
 always @ (posedge clk) begin
     k_last <= 1'b0;
     k_e <= 1'b0;
-    k_data <= '0;
+    k_data <= 16'd0;
     if(j_sof) begin
-        k_footer_i <= '0;
+        k_footer_i <= 1'b0;
         if(k_header_i < 4'd13) begin
             k_e <= 1'b1;
             k_data <= jls_header[k_header_i];
             k_header_i <= k_header_i + 4'd1;
         end
     end else if(j_e)
begin
-        k_header_i <= '0;
-        k_footer_i <= '0;
+        k_header_i <= 4'd0;
+        k_footer_i <= 1'b0;
         k_e <= 1'b1;
         k_data <= j_data;
     end else if(j_eof) begin
-        k_header_i <= '0;
+        k_header_i <= 4'd0;
         k_footer_i <= 1'b1;
         if(~k_footer_i) begin
             k_last <= 1'b1;
@@ -875,8 +929,8 @@ always @ (posedge clk) begin
             k_data <= jls_footer;
         end
     end else begin
-        k_header_i <= '0;
-        k_footer_i <= '0;
+        k_header_i <= 4'd0;
+        k_footer_i <= 1'b0;
     end
 end
@@ -884,7 +938,7 @@ end
 //-------------------------------------------------------------------------------------------------------------------
 // linebuffer for context pixels
 //-------------------------------------------------------------------------------------------------------------------
-reg [7:0] linebuffer [1<<14];
+reg [7:0] linebuffer [0:(1<<14)-1];
 always @ (posedge clk) // line buffer read
     c_d <= linebuffer[a_ii];
 always @ (posedge clk) // line buffer write
diff --git a/SIM/tb_jls_encoder.sv b/SIM/tb_jls_encoder.v
similarity index 82%
rename from SIM/tb_jls_encoder.sv
rename to SIM/tb_jls_encoder.v
index 5539997..4ca893a 100644
--- a/SIM/tb_jls_encoder.sv
+++ b/SIM/tb_jls_encoder.v
@@ -10,10 +10,10 @@
 `timescale 1ps/1ps
-`define NEAR 1 // NEAR can be 0~7
+`define NEAR 1 // NEAR can be 0~7
-`define FILE_NO_FIRST 1 // first input file name is test000.pgm
-`define FILE_NO_FINAL 8 // final input file name is test000.pgm
+`define FILE_NO_FIRST 1 // first input file name is test000.pgm
+`define FILE_NO_FINAL 8 // final input file name is test000.pgm
 // bubble numbers that insert between pixels
@@ -51,30 +51,33 @@ initial begin repeat(4) @(posedge clk); rstn<=1'b1; end
 // -------------------------------------------------------------------------------------------------------------------
 // signals for jls_encoder_i module
 // -------------------------------------------------------------------------------------------------------------------
-reg i_sof = '0;
-reg [13:0] i_w = '0;
-reg [13:0] i_h = '0;
-reg i_e = '0;
-reg [ 7:0] i_x = '0;
+reg
i_sof = 0;
+reg [13:0] i_w = 0;
+reg [13:0] i_h = 0;
+reg i_e = 0;
+reg [ 7:0] i_x = 0;
 wire o_e;
 wire[15:0] o_data;
 wire o_last;
-logic [7:0] img [4096*4096];
-int w = 0, h = 0;
+reg [7:0] img [4096*4096-1:0];
+integer w = 0, h = 0;
-task automatic load_img(input logic [256*8:1] fname);
-    int linelen, depth=0, scanf_num;
-    logic [256*8-1:0] line;
-    int fp = $fopen(fname, "rb");
-    if(fp==0) begin
+task load_img;
+    input [256*8:1] fname;
+    reg [256*8-1:0] line;
+    integer linelen, depth, scanf_num, fp, i;
+begin
+    depth = 0;
+    fp = $fopen(fname, "rb");
+    if (fp==0) begin
         $display("*** error: could not open file %s", fname);
         $finish;
     end
     linelen = $fgets(line, fp);
-    if(line[8*(linelen-2)+:16] != 16'h5035) begin
+    if (line[8*(linelen-2)+:16] != 16'h5035) begin
         $display("*** error: the first line must be P5");
         $fclose(fp);
         $finish;
@@ -87,17 +90,19 @@ task automatic load_img(input logic [256*8:1] fname);
     end
     scanf_num = $fgets(line, fp);
     scanf_num = $sscanf(line, "%d", depth);
-    if(depth!=255) begin
+    if (depth!=255) begin
         $display("*** error: images depth must be 255");
         $fclose(fp);
         $finish;
     end
-    for(int i=0; i 0, insert bubble_control bubbles
 // when < 0, insert random 0~bubble_control bubbles
 // -------------------------------------------------------------------------------------------------------------------
-task automatic feed_img(input int bubble_control);
-    int num_bubble;
-
+task feed_img;
+    input integer bubble_control;
+    integer num_bubble, i;
+begin
     // start feeding a image by assert i_sof for 368 cycles
     repeat(368) begin
         @(posedge clk)
         i_sof <= 1'b1;
         i_w <= w - 1;
         i_h <= h - 1;
-        {i_e, i_x} <= '0;
+        {i_e, i_x} <= 0;
     end
     // for all pixels of the image
-    for(int i=0; i