手把手教你在 C++17 中从零开始实现 ranges 库
手把手教你在 C++17 中从零开始实现 ranges 库
关于为什么 C++20 引入了 ranges 库却没有 zip 还要用户自己实现这件事
So in this post, you will learn how to write a ranges library from scratch in C++17,
with some useful functionality that doesn’t exists in the C++20 library <ranges>
,
like enumerate
or zip
.
C++11 range-based loop
It’s very convinent to use range-based for loop since C++11:
1 | std::vector<int> list; |
C++20 ranges library
But sometimes, I wonder what if I need to get the index while iterating over a range?
In Python, we can use enumerate
, zip
, map
, and so on.
While there is no equivalant functions in C++, until C++20 which introduced many
range operations like std::views::transform
for map
in Python, use it like this:
1 |
|
It will apply + 1
to every value in list
before calling into the loop body.
Yeah, the operator|
here - I’d like to call it the pipe style, is even more convinent than
Python’s function style range operations. Especially when the pipe is long:
1 |
|
Of course, the function style also supported in C++20:
1 |
|
The lambda definition looks too verbose here, let’s simplify it by a macro:
1 |
|
Wait a miniute…
However, the transform
or map
isn’t so useful after all.
While they don’t even have zip
and enumerate
!!!
They are very commonly used in daily programming. Otherwise I still have to write for (int i...
when the index is necessary.
So I decide to write my only ranges library rather than waiting for C++ standard to
support this.
My ranges library!
First of all, we need to define a generic range
class.
1 | template <class It> |
Then, we we want to create a range for a std::vector<int>
, we can use:
1 | std::vector<int> list; |
To be generic, we can use decltype
to automatically determine the iterator type of a given object:
1 | some_type_that_we_dont_know list; |
Compile-time argument deduction
But this way we will have to write range<decltype(t.begin())>(t.begin(), t.end())
every time… Don’t worry!
We can use the complile-time argument deduction (CTAD) feature since C++17 to automatically deduce the<class It>
with given rule when argument has a confirmed type:
1 | template <class It> |
Then, when range(t.begin(), t.end())
is called, it will be automatically deduced as range<decltype(t.begin())>(t.begin(), t.end())
.
The pipable class
Given that most transformations only takes one range
as input, thus can support the pipe style syntax (excluding zip
which requires multiple input), we’d like to define a common class for all kind of range operators, called pipable
:
1 | template <class F> |
It is able to automatically convert f(arg)
to pipe syntax arg | f
.
Where the following function used the ...
syntax since C++11:
1 | template <class ...Rs> |
is equivalant to:
1 | template <class R1, class R2, and_so_on> |
with flexible number of arguments.
Implementing map
First let’s implement the easiest range operator: map
To do so, we need to implement our custom iterators, which requires many knowledge on what C++ iterators actually are.
range-based for loops are 纸老虎
For example, the following C++11 range-based for loop:
1 | for (auto x: list) { |
Is nothing more than a shortcut for:
1 | for (auto it = list.begin(); it != list.end(); ++it) { |
Where the begin
and end
are duck-typing method names for all objects to be considered iterable, so if you write:
1 | int not_iterable = 42; |
The compiler will complain something like: error: int
have no member named begin
.
That’s because range-based for loops are just… nothing more than shortcuts…
Iterators are classes trying to imitate pointers
So what actually the type it
is?
1 | for (auto it = list.begin(); it != list.end(); ++it) { |
Looking at the *it
here, we wonder if it
is actually a pointer, i.e. int *
?
Not really, the real type of list.begin()
is vector<int>::iterator
, which is a class with these operator overloading methods defined:
operator*()
for getting the pointed value with*it
operator++()
for iterate to the next value with++it
operator!=()
for comparing two iterators to see if it comes toend()
As you can see, C++ iterators are classes trying to behave as if it is a pointer, for allowing algorithm reuse ability, and minimize
the effort for C programmers to adapt from C pointers to C++ iterators.
So to implement custom iterators…
So if we want to implement our own iterators, we can simply define the above three operator overloading methods to control the behavior.
Now that we want to implement map
, we acutally is willing the following code:
1 | for (auto x: list | map(func)) { |
to be translated into:
1 | for (auto it = list.begin(); it != list.end(); ++it) { |
right?
So why not just overload the operator*()
to make it return func(*it)
instead, while other operators remain the same.
Simple, just define a wrapper class:
1 | template <class Func, class Base> |
And the functor map
that returns a range of map_iterator
:
1 | template <class F> |
Here we used
map_iterator{...}
rather thanmap_iterator(...)
so that we don’t have to write the constructor ourselves but ask
the compiler to assign its members with default order (since C++11).
Then test it in main function:
1 | int main() { |
Compile and run it:
1 | $ g++ -std=c++20 a.cpp && ./a.out |
Succeed! We managed to make our first step in implementing ranges library ourselves!
What’s next?
Next, we will go ahead to implement the enumerate
and zip
as well with the same
technique we learnt from implementing the map
.
Implement enumerate
Now that we want to implement enumerate
, we acutally is willing the following code:
1 | for (auto [x, y]: list | enumerate(func)) { |
to be translated into:
1 | size_t index = 0; |
right?
Here, the
auto [x, y] = ...
is the structual binding syntax since C++17,
and...
can be astd::pair
, orstd::tuple
, or any thing unpackable, we will
take a deep look into this later.
So now we need to overload operator*()
to make it return a pair of values, whose
first value is the index, which is incresed during operator++()
, like this:
1 | template <class Base> |
And since the enumerate takes no Func
as input like map
does, we can
simply define enumerate
as a global variable:
1 | static constexpr auto enumerate = pipable([] (auto &&r) { |
The
static
here make sure that the symbol doesn’t conflict when the header
is being included for multiple times in a single project.
Test it again:
1 | int main() { |
Or use the function style if you like:
1 | int main() { |
It should outputs:
1 | 0 1 |
Worked! Congrats on being Pythonic in modern C++ programming :)
Go ranges and no more for (int i = 0; ...
bolierplates!
What about zip?
Homework time! Please try out what you’ve learnt by implement the zip
yourself, to test and solidify your new skill :)
For example:
1 | int main() { |
should outputs:
1 | 1 3 |
I also prepared some challenge for curious readers, here they goes:
challenge A
- Make
zip
resulting range size to be the minimum of all input ranges.
For example:
1 | int main() { |
should outputs:
1 | 1 3 |
challenge B
- Make
zip
takesn
number of ranges as input, whileoperator*()
returns astd::tuple
with sizen
.
For example:
1 | int main() { |
should outputs:
1 | 1 3 3.14 |
Yes, the Python
zip
can pass above two ‘challenge’ :)
Finally…
You may ‘submit’ the ‘homework’ via 评论区, or GitHub, or any
other way that fesible. Submitting this homework won’t affect
anyone’s GPA or KPI, but this one is just for fun!
Below is the final source code of this tutorial, your homework
may be either based on it or completely from scratch:
1 | // compile with -std=c++17 |
Also checkout my personal website [国内镜像站]
where this post were uploaded, Zhihu will also be uploaded synchronously.