Creating your own std::vector like class from scratch

Pages: 12
Hello everyone!

I've been looking at headers like vector , xmemory and stdexcept and
there is alot of stuff i dont understand.

I looked c++ tutorials also the new ones and they didn't include
those stuff what i see in those headers.

I want to learn all fundamental coding ways.
Usually i learn without looking at tutorials or asking help but
there is just so many things and i dont know how are they connected or whether they are at all.

Its feels like the first time coding.
The way this vector class is created is too different of the way like this:
1
2
3
4
5
6
7
8
9
10
11
class myclass
{
public:
myclass()
{
}

~myclass()
{
}
};

Im not even sure what to include here from vector header since
i have no idea what are they or how are things connected


from vector header.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
template<class _Myvec>
	class _Vector_const_iterator
		: public _Iterator012<random_access_iterator_tag,
			typename _Myvec::value_type,
			typename _Myvec::difference_type,
			typename _Myvec::const_pointer,
			typename _Myvec::const_reference,
			_Iterator_base>
	{	// iterator for nonmutable vector
public:
	typedef _Vector_const_iterator<_Myvec> _Myiter;
	typedef random_access_iterator_tag iterator_category;

	typedef typename _Myvec::value_type value_type;
	typedef typename _Myvec::difference_type difference_type;
	typedef typename _Myvec::const_pointer pointer;
	typedef typename _Myvec::const_reference reference;
	typedef typename _Myvec::pointer _Tptr;

I haven't heard about them before or see them.
I only see those types of things in headers what are included with compiler.

Can anyone tell me where should i look or how should i even start to learn
to understand everything what there is in vector header`?

Because of those things i listed in code, i dont understand anything
what is going on in vector header.

Just in case including the source code of vector header here:
http://pastebin.com/wNRinHTi

Thanks!



Last edited on
I want to learn all fundamental coding ways.
Usually i learn without looking at tutorials or asking help but
there is just so many things and i dont know how are they connected or whether they are at all.
The implementation of the standard library not a good source to learn programming.
If you are a beginner you should stop here and learn something else.
You may be able to look at this code in 2 years and think "yeah, I understand everything" but you won't learn much from reading it.

if you don't stop... well... good luck :^)
But keep in mind that that is not that this class is not different from normal classes:

template<class _Myvec>
template definition

1
2
3
4
5
6
7
class _Vector_const_iterator
		: public _Iterator012<random_access_iterator_tag,
			typename _Myvec::value_type,
			typename _Myvec::difference_type,
			typename _Myvec::const_pointer,
			typename _Myvec::const_reference,
			_Iterator_base>
inheritance

1
2
3
4
5
6
7
8
9
public:
	typedef _Vector_const_iterator<_Myvec> _Myiter;
	typedef random_access_iterator_tag iterator_category;

	typedef typename _Myvec::value_type value_type;
	typedef typename _Myvec::difference_type difference_type;
	typedef typename _Myvec::const_pointer pointer;
	typedef typename _Myvec::const_reference reference;
	typedef typename _Myvec::pointer _Tptr;
some typedefs

if you really want to learn it look into the keywords "C++ templates", "C++ inheritance", and "C++ typedef"
Last edited on
I understood templates, inheritance and typedef but the way
everything was used there, was new for me but now after reading the code of vector
i'm not scared of them. I think i understand now.

Anyways, i now understand your recommendation 'standard library is not a good source to learn programming' all tho i got some new tips and tricks from it
and i would now write my own vector's code differently.

Anyways here's my vector code:http://pastebin.com/5Ri5Jxky
(its an old way, before i knew new ways)

No matter how hard i try, my vector's remove function is slower than default vector's erase.
I just can't understand why.

I tried copying the default vectors erase functions.
I used
1
2
3
4
5
6
7
8
9
10
// i think that was the definition of the function what default vector was using in erase func
template<class _InIt,
	class _OutIt> inline
	_OutIt _Move(_InIt _First, _InIt _Last,
		_OutIt _Dest, _Nonscalar_ptr_iterator_tag)
	{	// move [_First, _Last) to [_Dest, ...), arbitrary iterators
	for (; _First != _Last; ++_Dest, ++_First)
		*_Dest = _STD move(*_First);
	return (_Dest);
	}


anyways, by using that function, my remove function was 20 time slower.
I'm really confused.
How does vector move items?
What kind of sorcery is going on there xD?

Its so hard to read the code and understand it because each definition leads to the other one
and so on...

I have to memories everything but there are just so many definitions ...
So if there is anyone who understand that witchcraft, please
make a simple small code example or something what would explain
how things are working there.

Thank you! <3



Hello,

a vector is a memory array, but one of all it is an iteratable container, the missing piece in your code, moreover you should move on a re-use model.(1)

The STL is an interesting source of knowledge for advanced programmer ;

if you want to reproduce a vector : 2 parts/objects you need 8-) : iterator, and an efficient memory buffer(1). (there are a lot of concepts and algo here to learn)

when you understand and have those pieces, you might build a vector container on top.

I am going to give you a start for an iterator, it does not take a pointer but an int, size_t , the second parameter is the friend, the container on which you want to implement begin() and end()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
template <typename ItemT, class FriendT>
class iterator_facet
{
friend FriendT;

protected:
   typedef iterator_facet<ItemT, FriendT> facet;
   typedef const iterator_facet<ItemT, FriendT> const_facet;
   typedef ItemT value_type;
   
protected:
   iterator_facet(value_type beg) : m_val (beg) { /* NOP */ }
   
public:
   value_type operator * () const { return m_val; }
   
   const_facet & operator ++ () { ++m_val; return *this; }
   facet operator ++ (int) { facet cpy(*this); ++m_val; return cpy; }
   
   const_facet & operator -- () { --m_val; return *this; }
   facet operator -- (int) { facet cpy(*this); --m_val; return cpy; }
   
   facet operator + (value_type n) const { return facet(m_val + n); }
   const_facet & operator += (value_type n) { m_val += n; return *this; }
   
   facet operator - (value_type n) const { return facet(m_val -n); }
   const_facet & operator -= (value_type n) { m_val -= n; return *this; }
   
   bool operator == (const_facet & other_it) const { return m_val == other_it.m_val; }
   bool operator != (const_facet & other_it) const { return m_val != other_it.m_val; }
   
private:
   value_type m_val;
};


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
// example 
class your_class
{
    typedef iterator_facet<std::size_t, your_class> iterator;

    iterator begin(); // return start position 
    iterator end();  // return end position 
};

your_class  some(...);

for (auto item : some)
{
    std::cout << item << std::endl;
}

for (your_class::iterator it = some.begin() ; it != some.end(); ++it)
{
   std::cout << *it << " " << std::endl;
}

Last edited on
I'm having one crazy idea how to boost the speed of moving.
Instead of moving data, we would move the pointer.

We need 2 arrays
T means datatype, it can be int, char or what ever data type/size
P means pointer of the data, its size is
32bit system: int
63bit system: long long int

Lets imagine that data is char and letters
T data = [a][b][c][d][e][f][g][h]
P pointer = [0][1][2][3][4][5][6][7]

to access my data i do this:
cout << "data:" (T)pointer[0] << endl;
data:a


now to remove first element i would do this:
1
2
3
element_to_remove = 0
datasize = 8
for( a = element_to_remove; a < datasize - 1; a++ ) pointer[a]++;


now things look like this:
T data = [a][b][c][d][e][f][g][h]
P pointer = [1][2][3][4][5][6][7][7]
cout << "data:" (T)pointer[0] << endl;
data:b


the problem is keeping track of empty or unused array places.
I need to write down those pointer values what i dont use anymore
so i could reuse them if i need them again.
For example i would need one what i removed when i push new item into my vector/array thingy.
I could do something like this:
1
2
3
4
5
6
7
8
9
10
11
myvectorsize = 8
T data[myvectorsize]
P pointer[myvectorsize]
P reserved[myvectorsize]
reservedsize = 0
// now lets remove first element:
item_to_remove = 0
reserved[reservedsize++] = pointer[item_to_remove];
for( a = item_to_remove; a < myvectorsize - 1; a++ ) pointer[a]++;
// Btw, pointer[a] += sizeof(t)
// just in case pointing that out... 

but i would use so much memory :/


Thats one thing i suspected what vector was doing but i guess its not doing it.

(there are a lot of concepts and algo here to learn)

Can you give me link for some?
I'm googling things yet i can't find anything what would help me to understand
the witchcraft of default vector.
Most of the thing i found point me into standard libraries what is not what i want.
I want to learn how one standard library works.

Others tell me the stuff i already know and have already used in this code:
http://pastebin.com/5Ri5Jxky



Last edited on
confusing part here:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// code part from default vector class
	void _Reallocate(size_type _Count)
	{	// move to array of exactly _Count elements
		pointer _Ptr = this->_Getal().allocate(_Count);

		_TRY_BEGIN
			_Umove(this->_Myfirst, this->_Mylast, _Ptr);
		_CATCH_ALL
			this->_Getal().deallocate(_Ptr, _Count);
		_RERAISE;
		_CATCH_END

			size_type _Size = size();
		if (this->_Myfirst != pointer())
		{	// destroy and deallocate old array
			_Destroy(this->_Myfirst, this->_Mylast);
			this->_Getal().deallocate(this->_Myfirst,
				this->_Myend - this->_Myfirst);
		}

		this->_Orphan_all();
		this->_Myend = _Ptr + _Count;
		this->_Mylast = _Ptr + _Size;
		this->_Myfirst = _Ptr;
	}


functions
deallocate and allocate, where's the definition of them?
I can't find the source code of those functions.

That is the main reason why i find this whole thing sorcery.
Hello,

PART1

I am happy to hear that "I'm having one crazy idea how to boost the speed of moving. Instead of moving data, we would move the pointer." 8-)

"the problem is keeping track of empty or unused array places." yes mostly, you just hit a problematic ; for instance an address stack inserting in the middle, but forget about that issue for now and focus on the iterator concern and position.

PART2

There is something that is called "core-language", you won't find it into the STL, C++ has a runtime library too 8-) ABI, and I would say in general, the most interesting parts of a language are sited into its compiler front-end.

"i find this whole thing sorcery" 8-D, that's just the beginning, but continue digging-in and sweating, because this is the only good way to learn, no theory can replace that neither a teacher,

papy mumbling mode begin

I learnt C by reading examples given by a teacher who was a programmer hobbyist and noticed that I had an interest in it, the internet was not existing (neither windows 2.0 8-) ), there was no computer at home, few at school, was really expensive at this time and I was not born in a wealthy family, in a redneck lost hole, but I remember spending nights during months deciphering the language, the grammar, what all this weirdo signs did, meant and how a computer was able to use that crap, no book, no documentation, no internet!, and barely/briefly having an access to a computer after class to test the new thinkings.

papy mumbling mode end
Last edited on
now i understand most things yet one function:
 
this->_Orphan_all();


its from std:vector::push_back function.
It doesn't seem to affect anything if i remove it.

The only thing it does right now in my eyes is that it's letting
some cpu ticks for waste.

BTW: i wasn't learning c++ for whole month xD ( i swear im not that stupid )
I took a break ... i usually do that when something comes along what i dont understand.
Last edited on
The only thing it does right now in my eyes is that it's letting
some cpu ticks for waste.

nah, if the function does nothing the compiler will optimise

But yeah, you're right, that function might have an empty body...
Sadly I can't tell you more than I read in the last few minutes:
http://stackoverflow.com/questions/11572845/why-is-visual-c-2010-still-calling-orphan-all-when-a-vector-is-destroyed-with
whats the differents now:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
		void grow(size_type newsize)
		{
			// pointer _Ptr = this->_Getal().allocate(newsize);
			pointer _Ptr = new _Ty[newsize];

			_TRY_BEGIN
				_Umove(this->data, this->_Mylast, _Ptr);
			_CATCH_ALL
				// this->_Getal().deallocate(_Ptr, newsize);
				delete[] _Ptr;
			_RERAISE;
			_CATCH_END

			if (this->data != pointer())
			{
				_Destroy(this->data, this->_Mylast);
				// this->_Getal().deallocate(this->data, this->_Myend - this->data);
				delete[] this->data;
			}

			this->_Orphan_all();
			this->_Myend = _Ptr + newsize;
			this->_Mylast = _Ptr + size();
			this->data = _Ptr;
		}

using new/delete instead of allocate/deallocate?


full source of my vector class:
http://pastebin.com/0aVeW0Nf
whats the differents now: using new/delete instead of allocate/deallocate?
Now it does not satisfy standard vector requirement: being able to provide your own memory allocation functions. (like what will you do if you want your vector to allocate data only in shared memory so other processes have access to it?)

Also I must warn that standard library implementation is not always written in standard C++. Often it uses platform or compiler-dependend tricks which will not work if you switch compilers/OS or if standard library has special threatment.
Last edited on
i shall use the allocate/deallocate then.
If my vector code works fine in linux with g++ compiler then everything would be fine.

I'm not going to support mac or any other os.
Only supporting win32/64 and linux32/64/128
I'm not going to support mac or any other os.
Only supporting win32/64 and linux32/64/128

You're making it look like you REALLY want to use your own vector class in your projects, is that so?
Yes and the next one will be std::string.
Anyways, Im trying to create a move function.

Here's how it looks like so far
1
2
3
		void move(size_type what, size_type to)
		{
		}

...

I tried to use std functions and tricks what vector does in order to create this effect:
1
2
3
4
5
6
sp::cvec v(10) = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
cout << "v:"; for (auto a: v) { cout << v[a] << ","; }; cout << endl;
v.move( 0, 5 );
cout << "v:"; for (auto a: v) { cout << v[a] << ","; }; cout << endl;
v.move( 9, 3 );
cout << "v:"; for (auto a: v) { cout << v[a] << ","; }; cout << endl;



v:0,1,2,3,4,5,6,7,8,9,
v:5,0,1,2,3,4,6,7,8,9,
v:5,0,1,8,2,3,4,6,7,9,


sadly I'm not sharp enough to find a way.
All tho i can always use old boring way:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
template<class _Ty, class _Alloc = allocator<_Ty> >
class cvec : public _calloc < !is_empty<_Alloc>::value, _vbasetypes<_Ty, _Alloc> >
...

void move( size_type what, size_type to )
{
	_Ty temp;
	if( what > to )
	{
		temp = data[to];
		int a = what;
		while( a > to ) data[a--] = data[a];
		data[what] = temp;
		return
	}
	temp = data[to];
	int a = what;
	while( a < to ) data[a++] = data[a];
	data[what] = temp;
	return
}
}


But the past has shown me that its not a good idea.
In past i tried to recreate vector functions without c++11/14 iterator, container, allocator
with old way using loops and they were slower.

Oh also listing here where i found information about everything in order
to understand just enough to create my own vector class ( by copying std one lel )

http://stdcxx.apache.org/doc/stdlibref/containers.html
http://stdcxx.apache.org/doc/stdlibref/allocator.html
http://stdcxx.apache.org/doc/stdlibref/iterators.html
http://stdcxx.apache.org/doc/stdlibug/16-3.html
Last edited on
Yes and the next one will be std::string.
I would not mind if it were for practice but your classes won't be as good as the standard libraries, those were written by really good people.
Why do you want to use them?
They will be faster and from my eyes, better design.
I'm going to test/compare it's functions speeds to std vector on different os. Right now in my os, some functions are already faster and non slower.
Also my vector allows access buffer pointer directly ( myvec.data )

myvec.data[x] is faster than myvec[x]
I'm not sure why tho ...

I know its created by really good people and chance for me out done them is small.
My class won't be worse, it would be same or better. Those are options.
But since design is different, its matter of taste.

Btw if you would see my string class then I'm 75% sure that you're going to like the idea
( not the code, thats for sure ... I'l try my best )



=========================
Anyways, any ideas about my move function ( 2 posts above me. )
Last edited on
They will be faster and from my eyes, better design.

I hope they also won't leak any resources and have some exception safety like the basic-gurantee.

I'm going to test/compare it's functions speeds to std vector on different os. Right now in my os, some functions are already faster and non slower.

Mind sharing your code and how you measure the difference?
When you compared performance, did you compile with optimizations turned on? In Visual C++ also make sure _SECURE_SCL is defined as 0 otherwise it will do a lot of costly bounds checking.

It's very much possible to design a vector class that has better performance for certain use cases, but that probably means you will have to give up performance for other operations.
Well, I was also able to implement a faster push_back but I don't understand why it's that much faster...

Did i sacrifice exception gurantee or safety somehow?
I'd say my push_back method everytime offers a strong-guarantee and the objects inside the vector allways stay in a valid state.

Is there something I'm missing when using memcpy?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
#include <iostream>
#include <chrono>
#include <vector>
#include <cstring>
#include <iomanip>

using namespace std::chrono;

template <typename T, class alloc_t=std::allocator<T>>
class myvec {
public:
    typedef std::size_t size_type;
    typedef alloc_t allocator_type;
    typedef const allocator_type const_allocator_type;
    
    typedef T value_type;
    typedef value_type& reference;
    typedef const value_type& const_reference;
    typedef value_type* pointer;
    typedef const pointer const_pointer;
    
    // constructors
    myvec()
        : allocator_()
        , start_()
        , size_()
        , capacity_()
    {}
    
    myvec(const myvec& vec)
        : allocator_(vec.allocator())
        , start_(allocator().allocate(vec.size()))
        , size_(vec.size())
        , capacity_(vec.size())
    {
        memcpy(start_, vec.start(), size());
    }
    
    myvec(myvec&& vec)
        : allocator_(std::move(vec.allocator_))
        , start_(std::move(vec.start_))
        , size_(std::move(vec.size_))
        , capacity_(std::move(vec.capacity_))
    {
        vec.start_ = 0;
        vec.size_ = 0;
        vec.capacity_ = 0;
    }
    
    ~myvec() {
        if(start_) {
            for(std::size_t i = 0; i < size(); ++i)
                start_[i].~value_type();
            
            allocator().deallocate(start_, capacity());
        }
    }
    
    // info
    size_type capacity() const { return capacity_; }
    size_type size() const { return size_; }
    
    allocator_type& allocator() { return allocator_; }
    const_allocator_type& allocator() const { return allocator_; }
    
    // modify
    void push_back(const_reference value) {
        if(capacity() == size()) // if the vector is full allocate new memory
        {
            // get new-needed space (geometric growth factor 2)
            if(capacity() != 0)
                capacity_ *= 2;
            else
                capacity_ = 1;
            
            // allocate space
            pointer temp = allocator().allocate(capacity());
            
            // then move the elements over and create the new element
            memcpy(temp, start_, size() * sizeof(value_type));
            
            allocator().deallocate(start_, size());
            start_ = temp;
        }
        new (data() + size()) value_type(value);
        
        ++size_;
    }
    
    // access
    pointer data() { return start_; }
    const_pointer data() const { return start_; }
    reference operator[](size_type position) { return start_[position]; }
    const_reference operator[](size_type position) const { return start_[position]; }
    
private:
    allocator_type allocator_;
    pointer start_;
    size_type size_;
    size_type capacity_;
};

void vectest()
{
    const int TESTS = 1000;
    const int ITEMS = 10000;
    auto dur = high_resolution_clock::duration();
    auto mydur = high_resolution_clock::duration();
    
    typedef int type;
    type obj;
    
    for(int i = 0; i < TESTS; ++i) {
        std::vector<type> vec;
        myvec<type> myvec;
        
        high_resolution_clock::time_point tp0 = high_resolution_clock::now();
        for(int i = 0; i < ITEMS; ++i) {
            vec.push_back(obj);
        }
        high_resolution_clock::time_point tp1 = high_resolution_clock::now();
        dur += tp1 - tp0;
        
        tp0 = high_resolution_clock::now();
        for(int i = 0; i < ITEMS; ++i) {
            myvec.push_back(obj);
        }
        tp1 = high_resolution_clock::now();
        mydur += tp1 - tp0;
    }
    dur /= TESTS;
    mydur /= TESTS;
    
    std::cout << std::setw(15) << "std::vector: " << dur.count() << std::endl;
    std::cout << std::setw(15) << "myvec: " << mydur.count() << std::endl;
    
    std::cout << "\n" << std::endl;
}

int main(void)
{
    vectest();
    
    return 0;
}

  std::vector: 45101
        myvec: 27340

http://cpp.sh/7zfl
Last edited on
Pages: 12