Tags: binary, constructed, function, matlab, occurrences, pdf, probabilitydensity, programming, situation, sorting, strings

Sorting binary strings

On Programmer » Matlab

13,864 words with 9 Comments; publish: Wed, 30 Apr 2008 21:14:00 GMT; (200138.67, « »)

My situation is this: I have constructed a probability

density function (PDF) for occurrences of binary strings like

'0000'

'0001'

'0011'

and so forth. The PDF is a vector that is indexed using the

bin2dec function (plus one), so that

'0000' corresponds to index 1 (i.e. bin2dec('0 0 0 0') + 1 = 1)

'0001' corresponds to index 2 (bin2dec('0001')+1 = 2)

'0111' corresponds to index 8 (bin2dec('0111')+1 = 8)

and so on for all 2^4 = 16 possibilities.

I would like to order my PDF vector so that '0000' corresponds

to PDF(1), '0001' is PDF(2), '0010' is PDF(3), etc., meaning

that I want to arrange the ordering so that strings with only

1 event (a single 1 in the string) have lower corresponding indices than

strings with 2 events (i.e. '0011'), which in turn have lower indices

than strings with 3 events and so forth.

The problem is that a string like '1000' will have a higher index

(using bin2dec) than '0011' because the bin2dec function returns

a larger value.

I'm looking for a way to reshuffle the indices of my PDF(:) to

reflect this desired ordering. I'm just not skilled in sorting

char string variables, if indeed there is an easy way to do it.

Let me outline all the possibilities for the simple case of binary

strings of length 3 (I need to do this for much larger strings):

string bin2dec+1 index desired index

---

'000' 1 1

'001' 2 2

'010' 3 3

'100' 5 4

'011' 4 5

'101' 6 6

'110' 7 7

'111' 8 8

In this case there is only shuffle necessary.

Thanks,

JB

All Comments

Leave a comment...

  • 9 Comments
    • Joe Blow wrote:

      >

      > My situation is this: I have constructed a probability

      > density function (PDF) for occurrences of binary strings like

      > '0000'

      > '0001'

      > '0011'

      > and so forth. The PDF is a vector that is indexed using the

      > bin2dec function (plus one), so that

      > '0000' corresponds to index 1 (i.e. bin2dec('0 0 0 0') + 1 = 1)

      > '0001' corresponds to index 2 (bin2dec('0001')+1 = 2)

      > '0111' corresponds to index 8 (bin2dec('0111')+1 = 8)

      > and so on for all 2^4 = 16 possibilities.

      > I would like to order my PDF vector so that '0000' corresponds

      > to PDF(1), '0001' is PDF(2), '0010' is PDF(3), etc., meaning

      > that I want to arrange the ordering so that strings with only

      > 1 event (a single 1 in the string) have lower corresponding indices

      > than

      > strings with 2 events (i.e. '0011'), which in turn have lower

      > indices

      > than strings with 3 events and so forth.

      > The problem is that a string like '1000' will have a higher index

      > (using bin2dec) than '0011' because the bin2dec function returns

      > a larger value.

      > I'm looking for a way to reshuffle the indices of my PDF(:) to

      > reflect this desired ordering. I'm just not skilled in sorting

      > char string variables, if indeed there is an easy way to do it.

      > Let me outline all the possibilities for the simple case of binary

      > strings of length 3 (I need to do this for much larger strings):

      > string bin2dec+1 index desired index

      > ---

      > '000' 1 1

      > '001' 2 2

      > '010' 3 3

      > '100' 5 4

      > '011' 4 5

      > '101' 6 6

      > '110' 7 7

      > '111' 8 8

      > In this case there is only shuffle necessary.

      > Thanks,

      > JB

      >

      For the first step of ypur problem:

      To determine the number of ones in a array of strings use something

      like

      sumone = sum(S == '1',2) ;

      in which S = ['000' ; '001' ; '010'; '011' ...] ;

      Jos

      #1; Wed, 30 Apr 2008 21:15:00 GMT
    • In article <pnXfe.19272$mp6.4048011.matlab.questionfor.info.twister.nyc.rr.com>, Joe Blow

      <joseph_blow16.matlab.questionfor.info.yahoo.com> wrote:

      > My situation is this: I have constructed a probability

      > density function (PDF) for occurrences of binary strings like

      > '0000'

      > '0001'

      > '0011'

      > and so forth. The PDF is a vector that is indexed using the

      > bin2dec function (plus one), so that

      > '0000' corresponds to index 1 (i.e. bin2dec('0 0 0 0') + 1 = 1)

      > '0001' corresponds to index 2 (bin2dec('0001')+1 = 2)

      > '0111' corresponds to index 8 (bin2dec('0111')+1 = 8)

      > and so on for all 2^4 = 16 possibilities.

      > I would like to order my PDF vector so that '0000' corresponds

      > to PDF(1), '0001' is PDF(2), '0010' is PDF(3), etc., meaning

      > that I want to arrange the ordering so that strings with only

      > 1 event (a single 1 in the string) have lower corresponding indices than

      > strings with 2 events (i.e. '0011'), which in turn have lower indices

      > than strings with 3 events and so forth.

      > The problem is that a string like '1000' will have a higher index

      > (using bin2dec) than '0011' because the bin2dec function returns

      > a larger value.

      > I'm looking for a way to reshuffle the indices of my PDF(:) to

      > reflect this desired ordering. I'm just not skilled in sorting

      > char string variables, if indeed there is an easy way to do it.

      > Let me outline all the possibilities for the simple case of binary

      > strings of length 3 (I need to do this for much larger strings):

      > string bin2dec+1 index desired index

      > ---

      > '000' 1 1

      > '001' 2 2

      > '010' 3 3

      > '100' 5 4

      > '011' 4 5

      > '101' 6 6

      > '110' 7 7

      > '111' 8 8

      > In this case there is only shuffle necessary.

      > Thanks,

      > JB

      --

      Hello JB,

      Here is a function which will generate binary strings in the order you

      desire. Within groups with equal numbers of 1-bits, ordinary binary

      number order prevails. I adapted it to your purposes from a routine I

      sent in March to another poster with a somewhat similar problem.

      %--

      function s = bineventsort(n)

      % This generates all 2^n possible binary string sequences

      % of n bits in order of ascending numbers of 1 bits. For

      % equal numbers of 1's they occur in ordinary binary

      % number sequence order. RAS - 5/10/05

      % Test n

      if floor(n) ~= n | n <= 0

      error('n must be a postive integer')

      end

      % Calculate Pascal's triangle

      p = 1;

      for i = 1:n

      p = [0 p] + [p 0];

      end

      % Generate the 2^n possible binary sequences

      x = zeros(2^n,n); % Allocate space for x

      i = 0;

      for m = 0:n % m is the number of 1's

      y = [zeros(1,n-m) ones(1,m)]; % Start with all the 1's to the right

      i = i + 1; x(i,:) = y; % Store it in x

      for j = 2:p(m+1) % Go through for all others with m 1's

      t = find(y); n1 = t(m); % Find the rightmost one

      t = find(~y); n0 = t(n1-m); % Find the rightmost zero left of that

      y(n0:n) = [1 0 y(n1+1:n) y(n0+2:n1)]; % Get next comb. using n1, n0

      i = i + 1; x(i,:) = y; % Store it in x

      end

      end

      % Convert the binary sequences to strings of '0' & '1' characters

      s = setstr(x+48);

      %--

      (Remove "xyzzy" and ".invalid" to send me email.)

      Roger Stafford

      #2; Wed, 30 Apr 2008 21:16:00 GMT
    • Joe Blow:

      <SNIP sorting binary strings...

      one of the many solutions

      % create some data

      b=dec2bin(0:16); % <- a string

      % the engine

      bs=sum(b,2); % <- sort char representation!

      [bs,bs]=sort(bs);

      bs=b(bs,:);

      % the result

      bs

      % ans =

      % 00000

      % 00001

      % 00010

      % 00100

      % 01000

      % 10000

      % 00011

      % 00101

      % 00110

      % 01001

      % 01010

      % 01100

      % 00111

      % 01011

      % 01101

      % 01110

      % 01111

      us

      #3; Wed, 30 Apr 2008 21:17:00 GMT
    • us wrote:

      >

      > Joe Blow:

      > <SNIP sorting binary strings...

      > one of the many solutions

      > % create some data

      > b=dec2bin(0:16); % <- a string

      > % the engine

      > bs=sum(b,2); % <- sort char representation!

      > [bs,bs]=sort(bs);

      > bs=b(bs,:);

      > % the result

      > bs

      > % ans =

      > % 00000

      > % 00001

      > % 00010

      > % 00100

      > % 01000

      > % 10000

      > % 00011

      > % 00101

      > % 00110

      > % 01001

      > % 01010

      > % 01100

      > % 00111

      > % 01011

      > % 01101

      > % 01110

      > % 01111

      Ah, but do you get the same answer if you <flipud> your input

      variable before doing the sort? My assumption was that numbers with

      equal numbers of bits set would be sorted by their binary value (e.g.

      1010>1001).

      I'm waiting 'till my lunch break before spending too much time on

      this one :)

      #4; Wed, 30 Apr 2008 21:18:00 GMT
    • Steve Amphlett wrote:

      >

      > Ah, but do you get the same answer if you <flipud> your input

      > variable before doing the sort? My assumption was that numbers

      > with

      > equal numbers of bits set would be sorted by their binary value

      > (e.g.

      > 1010>1001).

      > I'm waiting 'till my lunch break before spending too much time on

      > this one :)

      Yer tis:

      S=dec2bin(0:16);

      % First sort on number of bits

      x=sum(S-'0',2);

      [x,idx]=sort(x);

      S=S(idx,:);

      % Sort sub-sections on binary value

      a=find(diff(x));

      for b=1:length(a)-1

      c=a(b)+1:a(b+1);

      s=S(c,:);

      [d,d]=sort(bin2dec(s));

      S(c,:)=s(d,:);

      end

      #5; Wed, 30 Apr 2008 21:20:00 GMT
    • Steve Amphlett:

      <SNIP the usual himself: doubtful and hungry...

      > I'm waiting 'till my lunch break before spending too much time on

      this one...

      just an additional note

      the proposed snippet should always work - since you simply need to

      sort the bitstream-string

      % lets have a closer look

      % a char string made of <0>/<1>s

      b=dec2bin(0:16);

      % ...let's waste some cycles

      b=b(randperm(length(b)),:);

      bs=sortrows(b);

      % now

      bsum=sum(bs,2); % <- will turn bs into doubles

      % hence, bsum is the sum of 48s/49s

      % will NOT be unique -

      % ...BUT the order will stay within epochs of equal sums

      % ...after sorting

      % ...since the sum itself is only dependent on the nr of

      % ...zeros and ones, eg, scrutinize

      [bs,bs]=sort(bsum);

      [bsum bs]

      % which should make the code work for an array

      % of these problems

      just a note

      us

      #6; Wed, 30 Apr 2008 21:20:00 GMT
    • us wrote:

      >

      > Steve Amphlett:

      > <SNIP the usual himself: doubtful and hungry...

      >

      on

      > this one...

      > just an additional note

      > the proposed snippet should always work - since you simply need to

      > sort the bitstream-string

      > % lets have a closer look

      > % a char string made of <0>/<1>s

      > b=dec2bin(0:16);

      > % ...let's waste some cycles

      > b=b(randperm(length(b)),:);

      > bs=sortrows(b);

      > % now

      > bsum=sum(bs,2); % <- will turn bs into doubles

      > % hence, bsum is the sum of 48s/49s

      > % will NOT be unique -

      Which is why I used the same assuption as <Roger Stafford>:

      "Within groups with equal numbers of 1-bits, ordinary binary number

      order prevails." To do my second sorting phase.

      Mind you, the OP hasn't really told us what to do to resolve the

      non-uniqueness, so it's a bit academic.

      - Steve

      #7; Wed, 30 Apr 2008 21:22:00 GMT
    • In article <ef056f9.5.matlab.questionfor.info.webx.raydaftYaTP>, us <us.matlab.questionfor.info.neurol.unizh.ch> wrote:

      > just an additional note

      > SNIP

      > % hence, bsum is the sum of 48s/49s

      > % will NOT be unique -

      > % ...BUT the order will stay within epochs of equal sums

      > % ...after sorting

      > SNIP

      > us

      --

      us has revealed a property of 'sort' I was totally unaware of. The

      description of 'sort' promises: "for identical values in x, the location

      in the input vector or matrix determines location in the sorted list."

      This means that us's code and the first phase of Steve Amphlett's code

      retain the ordering possessed originally within groups with equal numbers

      of 1-bits. JB, you'd better toss out the stuff I sent you. Theirs is

      better.

      (Remove "xyzzy" and ".invalid" to send me email.)

      Roger Stafford

      #8; Wed, 30 Apr 2008 21:22:00 GMT
    • Roger Stafford wrote:

      > In article <ef056f9.5.matlab.questionfor.info.webx.raydaftYaTP>, us <us.matlab.questionfor.info.neurol.unizh.ch> wrote:

      >

      > --

      > us has revealed a property of 'sort' I was totally unaware of. The

      > description of 'sort' promises: "for identical values in x, the location

      > in the input vector or matrix determines location in the sorted list."

      > This means that us's code and the first phase of Steve Amphlett's code

      > retain the ordering possessed originally within groups with equal numbers

      > of 1-bits. JB, you'd better toss out the stuff I sent you. Theirs is

      > better.

      > (Remove "xyzzy" and ".invalid" to send me email.)

      > Roger Stafford

      Thank you all for your interest and valuable insights!

      JB

      #9; Wed, 30 Apr 2008 21:24:00 GMT