computer science goes bonk

feuding with std::numeric_limits

My long-standing feud with std::numeric_limits may be coming to a close:

#include <iostream>
#include <limits>

int main() {
  std::cout                                          // example output:
    << std::numeric_limits<short>::min() << '\n'     // -32768
    << std::numeric_limits<short>::max() << '\n'     //  32767

    << std::numeric_limits<int>::min() << '\n'       // -2147483648
    << std::numeric_limits<int>::max() << '\n'       //  2147483647

    << std::numeric_limits<double>::min() << '\n'    //  2.22507e-308...wait, what?
    << std::numeric_limits<double>::max() << '\n';   //  1.79769e+308
}

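The minimum value for a double is basically zero? Get outta here. (For floating-point types, min() returns the smallest positive normalized value, not the most negative one.)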

Fortunately, C++11 came to its senses -- partially -- and provides std::numeric_limits::lowest as a way of getting what most people expect to get from std::numeric_limits::min:

#include <iostream>
#include <limits>

int main() {
  std::cout                                            // example output:
    << std::numeric_limits<double>::lowest() << '\n'   // -1.79769e+308 (yay!)
    << std::numeric_limits<double>::max() << '\n';     //  1.79769e+308
}

It's not quite as sensible as boost::numeric::bounds, which defines both lowest and highest, but whatever. Baby steps. We'll get there someday.
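To make the min/lowest split concrete, here's a minimal sketch. The helper names smallest_positive_normal and most_negative are mine, made up for illustration; they just wrap the standard calls. The checks assume the usual built-in floating-point types, where lowest() equals -max().

```cpp
#include <limits>

// Hypothetical helpers, named here only for illustration.

// For floating-point types, min() is the smallest positive normalized value.
template <typename T>
constexpr T smallest_positive_normal() {
  return std::numeric_limits<T>::min();
}

// lowest() is the most negative finite value, for integers and floats alike.
template <typename T>
constexpr T most_negative() {
  return std::numeric_limits<T>::lowest();
}

// For double, lowest() is just -max()...
static_assert(most_negative<double>() == -std::numeric_limits<double>::max(),
              "lowest() should equal -max() for double");
// ...while min() is a tiny positive number, not a negative one.
static_assert(smallest_positive_normal<double>() > 0.0,
              "min() should be positive for double");
// For integer types, min() and lowest() agree.
static_assert(most_negative<int>() == std::numeric_limits<int>::min(),
              "min() and lowest() should agree for int");
```

Since everything here is constexpr, the compiler does the checking; nothing needs to run.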


abomination and astonishment

Did you know that in C (before C23), a function declaration of the form:

void f();

...declares a function f that can accept any number of arguments of any type?

In other words, the following program is legal C:

#include <stdio.h>

void f();

int main() {
  /* yup, this 'f' is that 'f' */
  f("really?", 42, 3.999);
  return 0;
}

void f(char *str, int i, double d) {
  printf("%s %d %f\n", str, i, d);
}

To declare a function that takes no arguments, you'd use:

void f(void);

So "nothing" means "anything", and void means "nothing". Ugh.

Apparently, C didn't even have the f(void) notation until the first ANSI C standard in 1989. Stroustrup actually introduced this notation in the early 1980s during the development of C with Classes (which later became C++).

Thankfully, Stroustrup followed an empirical approach to language design. When his users complained that his new f(void) was awkward and that C's old f() was misleading, he listened, despite reservations about introducing too many differences with C:

"It took support from both Doug McIlroy and Dennis Ritchie for me to build up the courage to make this break from C. Only after they used the word abomination about f(void) did I dare give f() the obvious meaning." D&E, p.41

So in C++, f() means exactly what you'd expect: a function f that takes no parameters.
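If you'd like the compiler to confirm that, here's a small sketch. The declarations f and g are placeholders that are only declared, never defined or called; the type traits compare the two spellings.

```cpp
#include <type_traits>

void f();      // C++ spelling: takes no parameters
void g(void);  // C-style spelling of the same thing

// Both declarations have exactly the same function type in C++.
static_assert(std::is_same<decltype(f), decltype(g)>::value,
              "f() and g(void) should have the same type");
static_assert(std::is_same<void(), void(void)>::value,
              "void() and void(void) should be the same type");

// f("really?", 42, 3.999);  // rejected by a C++ compiler: too many arguments
```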

Another win for the principle of least astonishment. Now if we could just get rid of that whole "variables default to int" thing.


if, switch, and RAII

I ran into some C++ code yesterday that made me wince. It looked something like this snippet (code obfuscated to protect the guilty, er, innocent):

int result = -1;
switch (foo_value) {
  case FOO_VALUE_A:
  {
    int i = foo();
    bar(i);
    result = baz(i);
  }
  case FOO_VALUE_B:
  {
    int j = qux();
    quxx(j);
    result = quxx(j);
  }
  // (and so on)
}

In order to declare new variables inside a case of a switch, we need to use curly braces to establish some scope for those variables to hang out in. But (is your spidey sense tingling yet?) in this example, those little scopes might almost make you overlook the fact that the switch is completely devoid of any breaks or returns, which means we'll execute multiple cases serially. That's probably not desired behavior.

This is why I hate switch as a language construct.

One way to think of it is like this: switches have the same problem that raw pointers have. Memory allocated through a raw pointer needs to be manually deleted at some point after it has been allocated. If you forget to perform this final action, the memory will leak from a localized region of time to future regions of time:

void foo() {
  Bar *b = new Bar();
  // ...
  delete b;
}

A std::unique_ptr, on the other hand, ties the deletion of that memory to a specific scope, ensuring that you won't leave that scope without deallocating your resource:

void foo() {
  std::unique_ptr<Bar> b(new Bar());
  // ...
}    // b goes out of scope, memory automatically deleted

Now, instead of thinking of memory leaking from one interval of time to another, think of control flow leaking from one area of code to another.

Using if helps ensure that your control flow won't leak from one condition to another. You don't need to worry about "deallocating" the condition when you reach the end of it. But if you forget to break after every case, you'll leak. You'll leak control.
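Here's a stripped-down sketch of the difference. The enum, the function names, and the return values 1 and 2 are all stand-ins I made up; the real code obviously did more than assign constants.

```cpp
#include <cassert>

// Hypothetical stand-ins for the obfuscated names above.
enum FooValue { FOO_VALUE_A, FOO_VALUE_B };

// Leaky shape: no break, so FOO_VALUE_A runs into FOO_VALUE_B.
int with_leaky_switch(FooValue v) {
  int result = -1;
  switch (v) {
    case FOO_VALUE_A: {
      result = 1;
    }
    // falls through - control leaks into FOO_VALUE_B
    case FOO_VALUE_B: {
      result = 2;
    }
  }
  return result;
}

// if/else shape: each branch ends where its braces end.
int with_if_else(FooValue v) {
  int result = -1;
  if (v == FOO_VALUE_A) {
    result = 1;
  } else if (v == FOO_VALUE_B) {
    result = 2;
  }
  return result;
}

// Small self-check; call it from main() to see the difference.
void check_control_flow() {
  assert(with_leaky_switch(FOO_VALUE_A) == 2);  // leaked into case B
  assert(with_if_else(FOO_VALUE_A) == 1);       // stayed in its branch
}
```

Same inputs, same braces, different answers. The switch version quietly hands back the result for the wrong case.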

Sure, switch can be used responsibly. So can raw pointers. If your last name is Kernighan or Ritchie, go nuts. Throw in some gotos, while you're at it. For most people, however, switch is a bad construct.


foreach a jolly good fellow

C++11 has a new way of doing for loops. Well, "new" might be a bit subjective, seeing as how Perl has had foreach since like 1994. But whatever. The standards committee took their time, attended a lot of meetings in exotic locations [Indiana], and made sure that the new range-based for loops wouldn't turn into another exception specification. The result is a dramatic improvement over what C++03 offered:

std::set<char> letters = {'o', 'm', 'g'};
for (char c: letters) {  // range-based for loop
  std::cout << c << '\n';
}

But how does it work?

Under the covers, when the compiler sees a statement of the form:

for (range-declaration: expression) statement

...it translates it into something like:

{
  auto && __range = range-init;
  for (auto __begin = begin-expr, __end = end-expr;
       __begin != __end;
       ++__begin) {
    range-declaration = *__begin;
    statement
  }
}

range-init is usually ( expression ) unless we're iterating over an initializer list, in which case it's just the raw list.

If the type of expression is an array, then begin-expr is __range and end-expr is __range + __bound, where __bound is the size of the array.

char letters[] = {'o', 'm', 'g'};
for (char c: letters) {  // works for arrays, too
  std::cout << c << '\n';
}

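Otherwise, if the type of __range is a class with begin and end members, begin-expr is __range.begin() and end-expr is __range.end(). Failing that, begin-expr is begin(__range) and end-expr is end(__range), where begin and end are found by argument-dependent lookup (with namespace std treated as an associated namespace, so std::begin and std::end are candidates).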
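To see the translation in action, here's a hand-expanded version of the std::set loop next to the real thing. It's a sketch, not compiler output: range_, begin_ and end_ stand in for the compiler's internal __range, __begin and __end (names with double underscores are reserved), and the function names are made up. std::set has begin()/end() members, which is what ends up being called.

```cpp
#include <cassert>
#include <set>
#include <string>

// Collect the letters using the range-based for loop.
std::string with_range_for(const std::set<char> &letters) {
  std::string out;
  for (char c : letters) {
    out += c;
  }
  return out;
}

// Roughly what the loop above is translated into.
std::string expanded_by_hand(const std::set<char> &letters) {
  std::string out;
  {
    auto &&range_ = (letters);
    for (auto begin_ = range_.begin(), end_ = range_.end();
         begin_ != end_;
         ++begin_) {
      char c = *begin_;  // the range-declaration
      out += c;          // the statement
    }
  }
  return out;
}

// Small self-check; call it from main().
void check_expansion() {
  const std::set<char> letters = {'o', 'm', 'g'};
  assert(with_range_for(letters) == expanded_by_hand(letters));
}
```

Both functions visit the elements in the set's sorted order, so 'o', 'm', 'g' comes back as "gmo".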

Iterating over custom types

The loop picks up begin/end methods that expose iterators, so you can iterate over custom types as long as they're well-behaved:

#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <iostream>
#include <memory>

template <typename T>
class Foo {
 public:
  // copy the initializer list into a heap-allocated array
  Foo(const std::initializer_list<T> &init) :
    m_size(init.size()),
    m_data(new T[init.size()]) {
    std::copy(init.begin(), init.end(), m_data.get());
  }
  // expose begin/end for iteration support
  T* begin() { return m_data.get(); }
  T* end() { return m_data.get() + m_size; }
 private:
  std::size_t m_size;
  std::unique_ptr<T[]> m_data;
};

int main() {
  Foo<int> foo = {3, 1, 4, 1, 5, 9};
  for (auto c: foo) {
    std::cout << c << '\n';
  }
}
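Member functions aren't the only route. If a type has no begin/end members, the loop falls back to free functions named begin and end found through argument-dependent lookup. Here's a minimal sketch of that route; the demo namespace, the Triple struct, and sum_triple are all hypothetical names for illustration.

```cpp
#include <cassert>

namespace demo {  // hypothetical namespace and type, for illustration only

// A fixed-size buffer with no begin()/end() members.
struct Triple {
  int values[3];
};

// Free functions in the same namespace as Triple; the range-based for loop
// finds them through argument-dependent lookup.
inline const int *begin(const Triple &t) { return t.values; }
inline const int *end(const Triple &t) { return t.values + 3; }

}  // namespace demo

int sum_triple(const demo::Triple &t) {
  int total = 0;
  for (int v : t) {  // uses demo::begin(t) and demo::end(t)
    total += v;
  }
  return total;
}

// Small self-check; call it from main().
void check_free_begin_end() {
  const demo::Triple t = {{3, 1, 4}};
  assert(sum_triple(t) == 8);
}
```

This is handy when you can't modify the type itself, say because it comes from a C library.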

Iterating over initializer lists

std::initializer_list exposes begin and end methods, so it can be iterated over:

for (auto c: {3, 1, 4, 1, 5, 9}) {
  std::cout << c << '\n';
}

squirrely operators

Sometimes C++ programmers come up with...unexpected...solutions to problems.

I'm not talking about IOCCC submissions, which are frighteningly awesome. What I'm talking about are three (arguably) useful idioms that you simply don't see very often: the goes-to (-->), bang-bang (!!), and decay (+) "operators".

Even if these "operators" are a little too squirrely for you, it's useful to be able to recognize them when you encounter them in the wild. Let's look at them one by one:

Goes-to (-->) operator

The first operator is --> (aka "goes-to"). It might appear in a situation like this:

int x = 100;
while (x --> 0) {
   std::cout << x << std::endl;
}

What does that code do? It displays the numbers from 99 down to 0, one per line.

If you look at any C++ operator precedence chart, you won't see --> anywhere. That's because the "goes-to" operator isn't really an operator. It's two operators (-- and >) combined with some unfortunate spacing. How cute.

int x = 100;
while (x-- > 0) {    // with normal spacing
   std::cout << x << std::endl;
}
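If you don't trust the tokenizer story, here's a quick sketch that records what each spelling produces instead of printing it. The function names are made up for the occasion.

```cpp
#include <cassert>
#include <vector>

// Counts down using the "goes-to" spelling.
std::vector<int> countdown_goes_to(int x) {
  std::vector<int> seen;
  while (x --> 0) {  // tokenized as (x--) > 0
    seen.push_back(x);
  }
  return seen;
}

// Same loop with conventional spacing.
std::vector<int> countdown_plain(int x) {
  std::vector<int> seen;
  while (x-- > 0) {
    seen.push_back(x);
  }
  return seen;
}

// Small self-check; call it from main().
void check_countdown() {
  const std::vector<int> a = countdown_goes_to(100);
  assert(a == countdown_plain(100));
  assert(a.size() == 100);
  assert(a.front() == 99);
  assert(a.back() == 0);
}
```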

Verdict: It's a little ironic that an expression that actually looks like what it does could be harder for programmers to parse than a more typical expression. That combined with the fact that there's nothing functionally different about this syntax relegates --> to the domain of "geek party tricks".

Bang-bang (!!) operator

The !! (aka bang-bang or double-bang operator) is just two logical negation (!) operators applied in quick succession. It might look like this:

Foo f;
if (!!f) {
   // ...
}

This operator is usually used to coerce a boolean value out of some type, although it's been said that !! was used in C to map arbitrary numbers to 0 or 1 for use as indices into arrays of size two.
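The array-index trick might look something like the sketch below. The function names and the two labels are invented for illustration; the only real ingredient is !!.

```cpp
#include <cassert>
#include <cstring>

// Maps any integer to 0 or 1: the first ! yields a bool,
// the second flips it back.
int to_zero_or_one(int n) {
  return !!n;
}

// Uses the 0/1 result to index a two-element table.
const char *describe(int n) {
  static const char *const labels[2] = {"zero", "nonzero"};
  return labels[!!n];
}

// Small self-check; call it from main().
void check_bang_bang() {
  assert(to_zero_or_one(0) == 0);
  assert(to_zero_or_one(42) == 1);
  assert(to_zero_or_one(-7) == 1);
  assert(std::strcmp(describe(0), "zero") == 0);
  assert(std::strcmp(describe(5), "nonzero") == 0);
}
```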

Wait a minute, you shout angrily. What's wrong with operator bool? Well, adding a generic conversion to bool can add unexpected bonus behavior to a type:

#include <iostream>

class Foo {
   bool m_b;
public:
   explicit Foo(bool b) : m_b(b) {}
   operator bool() const { return m_b; }
};

int main() {
   Foo f(true);
   if (!f) std::cout << "f is false" << std::endl; // sure
   if (f) std::cout << "f is true" << std::endl;   // makes sense
   int i = f;              // wait a minute
   f << 1;                 // what?
   if (f < Foo(false)) {}  // um...
}

All of those expressions involving a Foo are completely valid, given that Foo has an operator bool.

So how can we get our conversion to bool to be a little less promiscuous? One answer involves overloading operator! and using !!:

#include <iostream>

class Foo {
   bool m_b;
public:
   explicit Foo(bool b) : m_b(b) {}
   bool operator!() const { return !m_b; }
};

int main() {
   Foo f(true);
   if (!f) std::cout << "f is false" << std::endl; // sure
   if (!!f) std::cout << "f is true" << std::endl; // also works
   int i = f;              // error
   f << 1;                 // error
   if (f < Foo(false)) {}  // error
}

Verdict: Sure, !! is cryptic, but it's also arguably more useful than -->. It's certainly more concise than a static_cast, it can compress numbers down to 0/1 values easily, and it's one way of implementing a safe bool.
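For comparison, C++11 also lets you mark a conversion operator explicit, which is another way to rein in the conversion. Here's a minimal sketch; SafeFoo and is_set are hypothetical names, and the type traits stand in for the "error" lines above so that the whole thing still compiles.

```cpp
#include <cassert>
#include <type_traits>

class SafeFoo {  // hypothetical name, for illustration only
   bool m_b;
public:
   explicit SafeFoo(bool b) : m_b(b) {}
   explicit operator bool() const { return m_b; }
};

// Contextual conversions such as if (f) and !f still work...
bool is_set(const SafeFoo &f) {
   if (f) return true;
   return false;
}

// ...but implicit conversions to bool or int are rejected.
static_assert(!std::is_convertible<SafeFoo, bool>::value,
              "no implicit conversion to bool");
static_assert(!std::is_convertible<SafeFoo, int>::value,
              "no implicit conversion to int");
// An explicit conversion is still allowed.
static_assert(std::is_constructible<bool, SafeFoo>::value,
              "explicit conversion to bool is allowed");

// Small self-check; call it from main().
void check_safe_foo() {
   assert(is_set(SafeFoo(true)));
   assert(!is_set(SafeFoo(false)));
   assert(!SafeFoo(false));
}
```

With this version you write if (f) rather than if (!!f), so it sidesteps the squirrely operator entirely.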

Decay (+) operator

In a comment on a Stack Overflow question on array decaying, the prolific litb pointed out an unusual use for the unary + operator: it can be used to force arrays and functions to "decay" (i.e. be converted) to pointers.

This gruesome operator could be useful if you're dealing with templates and you need a reference to a pointer:

// suppose we're given this function template:
template <typename T>
void f(T * const & arg) {
   // ...
}

void g() {}

int main() {
   int arr[] = {3, 1, 4, 1, 5, 9};
   f(arr); // error: no matching function call
   f(+arr); // works!

   f(g); // error: no matching function call
   f(+g); // works!
}

Verdict: I'm torn on this one. It's obscure, but it's arguably nicer than a static_cast to some hard-to-parse type.
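To see exactly what the + does to the types, here's a small compile-time sketch that reuses the arr and g names from the example above:

```cpp
#include <type_traits>

int arr[] = {3, 1, 4, 1, 5, 9};
void g() {}

// Without +, the array and the function keep their own types...
static_assert(std::is_same<decltype(arr), int[6]>::value,
              "arr is an array of six ints");
static_assert(std::is_same<decltype(g), void()>::value,
              "g is a function");

// ...with +, they decay to pointers.
static_assert(std::is_same<decltype(+arr), int *>::value,
              "+arr is a pointer to int");
static_assert(std::is_same<decltype(+g), void (*)()>::value,
              "+g is a pointer to function");
```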

