summaryrefslogtreecommitdiff
path: root/src/libcamera/software_isp/TODO
blob: 4fcee39b2891ec710326ab860d181d2cacd74780 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
1. Setting F_SEAL_SHRINK and F_SEAL_GROW after ftruncate()

>> SharedMem::SharedMem(const std::string &name, std::size_t size)
>> 	: name_(name), size_(size), mem_(nullptr)
>>
>> ...
>>
>> 	if (ftruncate(fd_.get(), size_) < 0)
>> 		return;
>
> Should we set the GROW and SHRINK seals (in a separate patch) ?

Yes, this can be done.
Setting F_SEAL_SHRINK and F_SEAL_GROW after the ftruncate() call above could catch
some potential errors related to improper access to the shared memory allocated by
the SharedMemObject.

---

2. Reconsider stats sharing

>>> +void SwStatsCpu::finishFrame(void)
>>> +{
>>> +	*sharedStats_ = stats_;
>> 
>> Is it more efficient to copy the stats instead of operating directly on
>> the shared memory ?
>
> I inherited doing things this way from Andrey. I kept this because
> we don't really have any synchronization with the IPA reading this.
>
> So the idea is to only touch this when the next set of statistics
> is ready since we don't know when the IPA is done with accessing
> the previous set of statistics ...
>
> This is both something which seems mostly a theoretic problem,
> yet also definitely something which I think we need to fix.
>
> Maybe use a ringbuffer of stats buffers and pass the index into
> the ringbuffer to the emit signal ?

That would match how we deal with hardware ISPs, and I think that's a
good idea. It will help decoupling the processing side from the IPA.

---

3. Remove statsReady signal

> class SwStatsCpu
> {
> 	/**
> 	 * \brief Signals that the statistics are ready
> 	 */
> 	Signal<> statsReady;

But better, I wonder if the signal could be dropped completely. The
SwStatsCpu class does not operate asynchronously. Shouldn't whoever
calls the finishFrame() function then handle emitting the signal ?

Now, the trouble is that this would be the DebayerCpu class, whose name
doesn't indicate as a prime candidate to handle stats. However, it
already exposes a getStatsFD() function, so we're already calling for
trouble :-) Either that should be moved to somewhere else, or the class
should be renamed. Considering that the class applies colour gains in
addition to performing the interpolation, it may be more of a naming
issue.

Removing the signal and refactoring those classes doesn't have to be
addressed now, I think it would be part of a larger refactoring
(possibly also considering platforms that have no ISP but can produce
stats in hardware, such as the i.MX7), but please keep it on your radar.

---

4. Hide internal representation of gains from callers

> struct DebayerParams {
> 	static constexpr unsigned int kGain10 = 256;

Forcing the caller to deal with the internal representation of gains
isn't nice, especially given that it precludes implementing gains of
different precisions in different backend. Wouldn't it be better to pass
the values as floating point numbers, and convert them to the internal
representation in the implementation of process() before using them ?

---

5. Store ISP parameters in per-frame buffers

> /**
>  * \fn void Debayer::process(FrameBuffer *input, FrameBuffer *output, DebayerParams params)
>  * \brief Process the bayer data into the requested format.
>  * \param[in] input The input buffer.
>  * \param[in] output The output buffer.
>  * \param[in] params The parameters to be used in debayering.
>  *
>  * \note DebayerParams is passed by value deliberately so that a copy is passed
>  * when this is run in another thread by invokeMethod().
>  */

Possibly something to address later, by storing ISP parameters in
per-frame buffers like we do for hardware ISPs.

---

6. Input buffer copying configuration

> DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats)
> 	: stats_(std::move(stats)), gammaCorrection_(1.0)
> {
> 	enableInputMemcpy_ = true;

Set this appropriately and/or make it configurable.

---

7. Performance measurement configuration

> void DebayerCpu::process(FrameBuffer *input, FrameBuffer *output, DebayerParams params)
> /* Measure before emitting signals */
> if (measuredFrames_ < DebayerCpu::kLastFrameToMeasure &&
>     ++measuredFrames_ > DebayerCpu::kFramesToSkip) {
> 	timespec frameEndTime = {};
> 	clock_gettime(CLOCK_MONOTONIC_RAW, &frameEndTime);
> 	frameProcessTime_ += timeDiff(frameEndTime, frameStartTime);
> 	if (measuredFrames_ == DebayerCpu::kLastFrameToMeasure) {
> 		const unsigned int measuredFrames = DebayerCpu::kLastFrameToMeasure -
> 						    DebayerCpu::kFramesToSkip;
> 		LOG(Debayer, Info)
> 			<< "Processed " << measuredFrames
> 			<< " frames in " << frameProcessTime_ / 1000 << "us, "
> 			<< frameProcessTime_ / (1000 * measuredFrames)
> 			<< " us/frame";
> 	}
> }

I wonder if there would be a way to control at runtime when/how to
perform those measurements. Maybe that's a bit overkill.

---

8. DebayerCpu cleanups

> >> class DebayerCpu : public Debayer, public Object
> >>   const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
> >
> > This,
>
> Note the statistics pass-through stuff is sort of a necessary evil
> since we want one main loop going over the data line by line and
> doing both debayering as well as stats while the line is still
> hot in the l2 cache. And things like the process2() and process4()
> loops are highly CPU debayering specific so I don't think we should
> move those out of the CpuDebayer code.

Yes, that I understood from the review. "necessary evil" is indeed the
right term :-) I expect it will take quite some design skills to balance
the need for performances and the need for a maintainable architecture.

> > plus the fact that this class handles colour gains and gamma,
> > makes me thing we have either a naming issue, or an architecture issue.
>
> I agree that this does a bit more then debayering, although
> the debayering really is the main thing it does.
>
> I guess the calculation of the rgb lookup tables which do the
> color gains and gamma could be moved outside of this class,
> that might even be beneficial for GPU based debayering assuming
> that that is going to use rgb lookup tables too (it could
> implement actual color gains + gamma correction in some different
> way).
>
> I think this falls under the lets wait until we have a GPU
> based SoftISP MVP/POC and then do some refactoring to see which
> bits should go where.

---

8. Decouple pipeline and IPA naming

> The current src/ipa/meson.build assumes the IPA name to match the
> pipeline name. For this reason "-Dipas=simple" is used for the
> Soft IPA module.

This should be addressed.

---

9. Doxyfile cleanup

>> diff --git a/Documentation/Doxyfile.in b/Documentation/Doxyfile.in
>> index a86ea6c1..2be8d47b 100644
>> --- a/Documentation/Doxyfile.in
>> +++ b/Documentation/Doxyfile.in
>> @@ -44,6 +44,7 @@ EXCLUDE                = @TOP_SRCDIR@/include/libcamera/base/span.h \
>>                            @TOP_SRCDIR@/src/libcamera/pipeline/ \
>>                            @TOP_SRCDIR@/src/libcamera/tracepoints.cpp \
>>                            @TOP_BUILDDIR@/include/libcamera/internal/tracepoints.h \
>> +                         @TOP_BUILDDIR@/include/libcamera/ipa/soft_ipa_interface.h \
> Why is this needed ?
>
>>                            @TOP_BUILDDIR@/src/libcamera/proxy/
>>     EXCLUDE_PATTERNS       = @TOP_BUILDDIR@/include/libcamera/ipa/*_serializer.h \
>> diff --git a/include/libcamera/ipa/meson.build b/include/libcamera/ipa/meson.build
>> index f3b4881c..3352d08f 100644
>> --- a/include/libcamera/ipa/meson.build
>> +++ b/include/libcamera/ipa/meson.build
>> @@ -65,6 +65,7 @@ pipeline_ipa_mojom_mapping = {
>>       'ipu3': 'ipu3.mojom',
>>       'rkisp1': 'rkisp1.mojom',
>>       'rpi/vc4': 'raspberrypi.mojom',
>> +    'simple': 'soft.mojom',
>>       'vimc': 'vimc.mojom',
>>   }
>>   diff --git a/include/libcamera/ipa/soft.mojom b/include/libcamera/ipa/soft.mojom
>> new file mode 100644
>> index 00000000..c249bd75
>> --- /dev/null
>> +++ b/include/libcamera/ipa/soft.mojom
>> @@ -0,0 +1,28 @@
>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
>> +
>> +/*
>> + * \todo Document the interface and remove the related EXCLUDE_PATTERNS entry.
> Ah that's why.

Yes, because, well... all the other IPAs were doing that...

> It doesn't have to be done before merging, but could you
> address this sooner than later ?

---

10. Switch to libipa/algorithm.h API in processStats

>> void IPASoftSimple::processStats(const ControlList &sensorControls)
>>
> Do you envision switching to the libipa/algorithm.h API at some point ?

At some point, yes.

---

11. Improve handling the sensor controls which take effect with a delay

> void IPASoftSimple::processStats(const ControlList &sensorControls)
> {
>       ...
>	/*
>	 * AE / AGC, use 2 frames delay to make sure that the exposure and
>	 * the gain set have applied to the camera sensor.
>	 */
>	if (ignore_updates_ > 0) {
>		--ignore_updates_;
>		return;
>	}

This could be handled better with DelayedControls.

---

12. Use DelayedControls class in ispStatsReady()

> void SimpleCameraData::ispStatsReady()
> {
> 	swIsp_->processStats(sensor_->getControls({ V4L2_CID_ANALOGUE_GAIN,
> 						    V4L2_CID_EXPOSURE }));

You should use the DelayedControls class.

---

13. Improve black level and colour gains application

I think the black level should eventually be moved before debayering, and
ideally the colour gains as well. I understand the need for optimizations to
lower the CPU consumption, but at the same time I don't feel comfortable
building up on top of an implementation that may work a bit more by chance than
by correctness, as that's not very maintainable.