author | Milan Zamazal <mzamazal@redhat.com> | 2025-03-26 10:08:47 +0100 |
---|---|---|
committer | Kieran Bingham <kieran.bingham@ideasonboard.com> | 2025-03-26 10:45:01 +0000 |
commit | e2b4000dc9cc9440234540a4fe28d2f08d84f5f3 (patch) | |
tree | df86409448ea57e2072af3526fa6f15f057c560b /src/ipa/simple/algorithms | |
parent | ac3068655643a8b2e9a5d002ad6fab104832e1c0 (diff) |
libcamera: software_isp: Apply CCM in debayering
This patch applies a color correction matrix (CCM) in debayering if one
is specified. Debayering without a CCM must remain supported for
performance reasons.
The CCM is applied as follows:
  [r1 g1 b1]   [r]
  [r2 g2 b2] * [g]
  [r3 g3 b3]   [b]
The CCM (the left side of the multiplication) is constant while a single
frame is processed, whereas the input pixel (the right side) changes.
Because each color channel is only 8-bit in software ISP, we can
precompute 9 lookup tables, one for each of the r_i, g_i, b_i
coefficients, each holding the products for all 256 possible input
values. This way we don't have to multiply for each pixel; we can use
table lookups and additions instead. Gamma (which is non-linear and thus
cannot be folded into the 9 lookup tables) is applied to the final
values, rounded to integers, using another lookup table.
Because the changing part is the pixel value with its three color
elements, only three dynamic table lookups are needed. We use three
lookup tables to hold the multiplied matrix values; each table
corresponds to one matrix column and thus to one pixel color, and each
of its entries holds the three products for that column.
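The lookup scheme described above can be sketched roughly as follows.
This is an illustration under assumptions, not the code from this patch:
CcmColumn, CcmLut, Pixel, makeColumnLut() and applyCcm() are hypothetical
names, and the gamma table is assumed to have 256 entries.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

/* Hypothetical types; only the int16_t triplet layout follows the text. */
struct CcmColumn {
	int16_t r, g, b;
};

using CcmLut = std::array<CcmColumn, 256>;

struct Pixel {
	uint8_t r, g, b;
};

/*
 * Precompute one matrix column (its contribution to output r, g and b)
 * multiplied by every possible 8-bit input value.
 */
CcmLut makeColumnLut(float toR, float toG, float toB)
{
	CcmLut lut{};
	for (unsigned int i = 0; i < lut.size(); i++) {
		lut[i].r = static_cast<int16_t>(std::lround(i * toR));
		lut[i].g = static_cast<int16_t>(std::lround(i * toG));
		lut[i].b = static_cast<int16_t>(std::lround(i * toB));
	}
	return lut;
}

/* Three lookups, additions, clamping, then the gamma lookup. */
Pixel applyCcm(const CcmLut &red, const CcmLut &green, const CcmLut &blue,
	       const std::array<uint8_t, 256> &gamma, Pixel in)
{
	const CcmColumn &cr = red[in.r];
	const CcmColumn &cg = green[in.g];
	const CcmColumn &cb = blue[in.b];

	/* The sums may leave [0, 255], so clamp before indexing gamma. */
	const int r = std::clamp(cr.r + cg.r + cb.r, 0, 255);
	const int g = std::clamp(cr.g + cg.g + cb.g, 0, 255);
	const int b = std::clamp(cr.b + cg.b + cb.b, 0, 255);

	return { gamma[r], gamma[g], gamma[b] };
}
```

With an identity matrix and an identity gamma table, applyCcm() returns
the input pixel unchanged, which makes for a quick sanity check.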
We use int16_t to store the precomputed multiplications. This seems to
be noticeably (>10%) faster than `float`, at the price of slightly lower
accuracy, and it covers the range of values that sane CCMs produce. The
selection and structure of the data is performance critical; for example,
using bytes would give a further significant (>10%) speedup, but bytes
are too narrow to cover the value range.
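The range argument can be checked with a small helper. This is only a
sketch; productFits() is a hypothetical name, and it assumes the largest
table input is 255, as for the 8-bit channels described above.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

/*
 * Returns true if the rounded product of the largest 8-bit input (255)
 * and the given CCM coefficient fits into the integer type T.
 */
template<typename T>
bool productFits(float coefficient)
{
	const long value = std::lround(255 * coefficient);
	return value >= std::numeric_limits<T>::min() &&
	       value <= std::numeric_limits<T>::max();
}
```

For example, productFits<int8_t>(1.0f) is false, because 255 exceeds the
int8_t maximum of 127 even for an identity coefficient, while
productFits<int16_t>() holds for any coefficient of magnitude up to
about 128.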
The color lookup tables can be represented either as unions,
accommodating tables for both the CCM and non-CCM cases, or as separate
tables for each of the cases, leaving the tables for the other case
unused. The latter is selected as a matter of preference.
The tables are copied (as before), which is not elegant but also not a
big problem. There are patches posted that use shared buffers for
parameter passing in software ISP (see software ISP TODO #5) and they
can be adjusted for the new parameter format.
The specified CCM is assumed not to include the color gains from white
balance. The gains are applied to it by matrix multiplication, which is
simple and consistent with future additions that also take the form of a
matrix multiplication, such as saturation adjustment.
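Folding the gains into the CCM can be sketched with a plain 3x3 product.
Matrix3, multiply() and combineGains() are hypothetical stand-ins used
only for illustration; the order of the operands (diagonal gain matrix
on the left) mirrors the `gainCcm * ccm` expression in the diff below.

```cpp
#include <array>

/* Hypothetical minimal 3x3 matrix type, row-major. */
using Matrix3 = std::array<std::array<float, 3>, 3>;

Matrix3 multiply(const Matrix3 &a, const Matrix3 &b)
{
	Matrix3 out{};
	for (unsigned int i = 0; i < 3; i++)
		for (unsigned int j = 0; j < 3; j++)
			for (unsigned int k = 0; k < 3; k++)
				out[i][j] += a[i][k] * b[k][j];
	return out;
}

/* Build a diagonal matrix from the gains and multiply it with the CCM. */
Matrix3 combineGains(float r, float g, float b, const Matrix3 &ccm)
{
	const Matrix3 gains = { { { r, 0.0f, 0.0f },
				  { 0.0f, g, 0.0f },
				  { 0.0f, 0.0f, b } } };
	return multiply(gains, ccm);
}
```

A further adjustment expressed as a matrix, such as saturation, could be
chained with another multiply() call before the lookup tables are built.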
With this patch, the reported per-frame slowdown when applying CCM is
about 45% on Debix Model A and about 75% on TI AM69 SK.
Using std::clamp in debayering adds some performance penalty (a few
percent). The clamping is necessary to eliminate out-of-range values
possibly produced by the CCM. If it could be avoided by adjusting the
precomputed tables in some way, performance could be improved a bit.
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Diffstat (limited to 'src/ipa/simple/algorithms')
-rw-r--r-- | src/ipa/simple/algorithms/lut.cpp | 53 |
-rw-r--r-- | src/ipa/simple/algorithms/lut.h | 1 |
2 files changed, 42 insertions, 12 deletions
diff --git a/src/ipa/simple/algorithms/lut.cpp b/src/ipa/simple/algorithms/lut.cpp
index 3a3daed7..a06cdeba 100644
--- a/src/ipa/simple/algorithms/lut.cpp
+++ b/src/ipa/simple/algorithms/lut.cpp
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: LGPL-2.1-or-later */
 /*
- * Copyright (C) 2024, Red Hat Inc.
+ * Copyright (C) 2024-2025, Red Hat Inc.
  *
  * Color lookup tables construction
  */
@@ -80,6 +80,11 @@ void Lut::updateGammaTable(IPAContext &context)
 	context.activeState.gamma.contrast = contrast;
 }

+int16_t Lut::ccmValue(unsigned int i, float ccm) const
+{
+	return std::round(i * ccm);
+}
+
 void Lut::prepare(IPAContext &context,
 		  [[maybe_unused]] const uint32_t frame,
 		  [[maybe_unused]] IPAFrameContext &frameContext,
@@ -91,22 +96,46 @@ void Lut::prepare(IPAContext &context,
 	 * observed, it's not permanently prone to minor fluctuations or
 	 * rounding errors.
 	 */
-	if (context.activeState.gamma.blackLevel != context.activeState.blc.level ||
-	    context.activeState.gamma.contrast != context.activeState.knobs.contrast)
+	const bool gammaUpdateNeeded =
+		context.activeState.gamma.blackLevel != context.activeState.blc.level ||
+		context.activeState.gamma.contrast != context.activeState.knobs.contrast;
+	if (gammaUpdateNeeded)
 		updateGammaTable(context);

 	auto &gains = context.activeState.awb.gains;
 	auto &gammaTable = context.activeState.gamma.gammaTable;
 	const unsigned int gammaTableSize = gammaTable.size();
-
-	for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
-		const double div = static_cast<double>(DebayerParams::kRGBLookupSize) /
-				   gammaTableSize;
-		/* Apply gamma after gain! */
-		const RGB<float> lutGains = (gains * i / div).min(gammaTableSize - 1);
-		params->red[i] = gammaTable[static_cast<unsigned int>(lutGains.r())];
-		params->green[i] = gammaTable[static_cast<unsigned int>(lutGains.g())];
-		params->blue[i] = gammaTable[static_cast<unsigned int>(lutGains.b())];
+	const double div = static_cast<double>(DebayerParams::kRGBLookupSize) /
+			   gammaTableSize;
+
+	if (!context.ccmEnabled) {
+		for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
+			/* Apply gamma after gain! */
+			const RGB<float> lutGains = (gains * i / div).min(gammaTableSize - 1);
+			params->red[i] = gammaTable[static_cast<unsigned int>(lutGains.r())];
+			params->green[i] = gammaTable[static_cast<unsigned int>(lutGains.g())];
+			params->blue[i] = gammaTable[static_cast<unsigned int>(lutGains.b())];
+		}
+	} else if (context.activeState.ccm.changed || gammaUpdateNeeded) {
+		Matrix<float, 3, 3> gainCcm = { { gains.r(), 0, 0,
+						  0, gains.g(), 0,
+						  0, 0, gains.b() } };
+		auto ccm = gainCcm * context.activeState.ccm.ccm;
+		auto &red = params->redCcm;
+		auto &green = params->greenCcm;
+		auto &blue = params->blueCcm;
+		for (unsigned int i = 0; i < DebayerParams::kRGBLookupSize; i++) {
+			red[i].r = ccmValue(i, ccm[0][0]);
+			red[i].g = ccmValue(i, ccm[1][0]);
+			red[i].b = ccmValue(i, ccm[2][0]);
+			green[i].r = ccmValue(i, ccm[0][1]);
+			green[i].g = ccmValue(i, ccm[1][1]);
+			green[i].b = ccmValue(i, ccm[2][1]);
+			blue[i].r = ccmValue(i, ccm[0][2]);
+			blue[i].g = ccmValue(i, ccm[1][2]);
+			blue[i].b = ccmValue(i, ccm[2][2]);
+			params->gammaLut[i] = gammaTable[i / div];
+		}
 	}
 }

diff --git a/src/ipa/simple/algorithms/lut.h b/src/ipa/simple/algorithms/lut.h
index 889f864b..77324800 100644
--- a/src/ipa/simple/algorithms/lut.h
+++ b/src/ipa/simple/algorithms/lut.h
@@ -33,6 +33,7 @@ public:

 private:
 	void updateGammaTable(IPAContext &context);
+	int16_t ccmValue(unsigned int i, float ccm) const;
 };

 } /* namespace ipa::soft::algorithms */