Un-de-duplicating strings


33

Introduction

Let's observe the following string:

AABBCCDDEFFGG

You can see that every letter has been duplicated, except for the letter E. That means that the letter E has been de-duplicated. So the only thing we need to do here is reverse that process, which gives us the following un-de-duplicated string:

AABBCCDDEEFFGG

Let's take a harder example:

AAAABBBCCCCDD

You can see that there is an odd number of consecutive Bs, which means that one of the BB pairs was de-duplicated in the original string. We only need to un-de-duplicate this letter, which gives us:

AAAABBBBCCCCDD


The challenge

Given a non-empty de-duplicated string consisting only of alphabetic characters (either only uppercase or only lowercase), return the un-de-duplicated string. You can assume that there will always be at least one de-duplicated character in the string.


Test cases

AAABBBCCCCDDDD    -->    AAAABBBBCCCCDDDD
HEY               -->    HHEEYY
AAAAAAA           -->    AAAAAAAA
N                 -->    NN
OOQQO             -->    OOQQOO
ABBB              -->    AABBBB
ABBA              -->    AABBAA

This is code-golf, so the shortest valid submission in bytes wins!


@mbomb007 Yes, that would result in AABBBB.
Adnan

1
I'm not sure I understand the challenge. Why does ABBB map to AABBBB, not AABBBBBB?
Dennis

2
@Dennis If you split each run of characters into groups of 2, you get the following: A BB B. The characters that are not paired (and therefore not duplicated) need to be duplicated, resulting in AA BB BB, which is the un-de-duplicated string.
Adnan

8
So: ensure that every run of characters has an even number of elements, by adding at most one element to the run?
MadPhysicist

1
@MadPhysicist Yes, that is correct
Adnan
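
The grouping rule described in the comments above can be sketched in Python. This is only an illustration of the rule, not code from any answer; the function names pair_groups and undeduplicate are made up for the example.

```python
def pair_groups(s):
    """Split s left to right into groups of at most two equal characters."""
    groups = []
    i = 0
    while i < len(s):
        # Take two characters when the next one matches, otherwise one.
        if i + 1 < len(s) and s[i + 1] == s[i]:
            groups.append(s[i] * 2)
            i += 2
        else:
            groups.append(s[i])
            i += 1
    return groups


def undeduplicate(s):
    # A lone character lost its twin, so double it; pairs stay as they are.
    return "".join(g if len(g) == 2 else g * 2 for g in pair_groups(s))


print(pair_groups("ABBB"))    # ['A', 'BB', 'B']
print(undeduplicate("ABBB"))  # AABBBB
```

Splitting first and doubling second keeps the intermediate "A BB B" form visible, which is the step the question's author uses to explain why ABBB maps to AABBBB.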

Answers:


20

MATL, 7 bytes

Y'to+Y"

Try it online! Or verify all test cases.

Let's take 'ABBA' as example input.

Y'   % Implicit input. Run-length encoding
     % STACK: 'ABA', [1 2 1]
t    % Duplicate top of the stack
     % STACK: 'ABA', [1 2 1], [1 2 1]
o    % Modulo 2
     % STACK: 'ABA', [1 2 1], [1 0 1]
+    % Add, element-wise
     % STACK: 'ABA', [2 2 2]
Y"   % Run-length decoding. Implicit display
     % STACK: 'AABBAA'
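
For readers who don't know MATL, the same pipeline (run-length encode, add each count's parity, run-length decode) can be sketched in Python. The helper names rle_encode, rle_decode and undedup_rle are assumptions for this illustration; only itertools.groupby is a real library function.

```python
from itertools import groupby


def rle_encode(s):
    """Return (chars, counts), e.g. 'ABBA' -> (['A', 'B', 'A'], [1, 2, 1])."""
    chars, counts = [], []
    for ch, run in groupby(s):
        chars.append(ch)
        counts.append(len(list(run)))
    return chars, counts


def rle_decode(chars, counts):
    """Inverse of rle_encode: repeat each character by its count."""
    return "".join(ch * n for ch, n in zip(chars, counts))


def undedup_rle(s):
    chars, counts = rle_encode(s)
    # Add each count's parity to itself: odd counts grow by one, even ones stay.
    padded = [n + n % 2 for n in counts]
    return rle_decode(chars, padded)


print(rle_encode("ABBA"))   # (['A', 'B', 'A'], [1, 2, 1])
print(undedup_rle("ABBA"))  # AABBAA
```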


8

Perl, 16 bytes

15 bytes of code + -p flag.

s/(.)\1?/$1$1/g

To run it:

perl -pe 's/(.)\1?/$1$1/g' <<< 'HEY'
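
The same substitution can be tried in Python, as a sketch for readers without Perl at hand. The function name undedup_regex is made up; re.sub and the \1 backreference are standard.

```python
import re


def undedup_regex(s):
    # (.) captures one character; \1? optionally consumes an identical
    # neighbour. Either way the match is replaced by exactly two copies.
    return re.sub(r"(.)\1?", r"\1\1", s)


print(undedup_regex("HEY"))   # HHEEYY
print(undedup_regex("ABBB"))  # AABBBB
```

Because matches are found left to right and never overlap, a run of three Bs is consumed as BB followed by B, and each of those becomes BB.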

7

Haskell, 36 bytes

u(a:b:c)=a:a:u([b|a/=b]++c)
u x=x++x

Usage example: u "OOQQO" -> "OOQQOO".

If the string has at least 2 elements, take two copies of the first and append a recursive call with

  • the second element and the rest if the first two elements differ or
  • just the rest

If there are fewer than two elements (one or zero), take two copies of the list.
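
The recursion described above translates almost line for line into Python. This is a sketch for illustration, not a golfed answer; like the Haskell version, it recurses once per group, so very long inputs would hit Python's recursion limit.

```python
def u(xs):
    # Fewer than two elements: return two copies of what is left.
    if len(xs) < 2:
        return xs + xs
    a, b, rest = xs[0], xs[1], xs[2:]
    # Keep b only when it differs from a (it then starts the next group).
    tail = rest if a == b else b + rest
    return a + a + u(tail)


print(u("OOQQO"))  # OOQQOO
```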


6

Brachylog, 17 bytes

@b:{~b#=.l#e,|}ac

Try it online!

Explanation

Example input: "ABBB"

@b                  Blocks: Split into ["A", "BBB"]
  :{          }a    Apply the predicate below to each element of the list: ["AA", "BBBB"]
                c   Concatenate: "AABBBB"

    ~b#=.             Output is the input with an additional element at the beginning, and
                        all elements of the output are the same (e.g. append a leading "B")
        .l#e,         The length of the Output is an even number
             |        Or: Input = Output (i.e. do nothing)


4

JavaScript (ES6), 37 30 bytes

Saved 7 bytes by using the much more efficient '$1$1', as other answers did

s=>s.replace(/(.)\1?/g,'$1$1')



4

Mathematica, 41 bytes

s=StringReplace;s[s[#,a_~~a_->a],b_->b~~b]&

Unnamed function that takes a string as input and returns a string. It completely deduplicates the string, then completely undeduplicates it. Not very short, but I couldn't do better for now.
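
The two-pass idea (collapse every adjacent pair, then double every character) can be sketched in Python with two regex substitutions. The function name undedup_two_pass is an assumption for this example, and the sketch relies on re.sub replacing non-overlapping matches from left to right.

```python
import re


def undedup_two_pass(s):
    # Pass 1: every adjacent pair of equal characters collapses to one.
    collapsed = re.sub(r"(.)\1", r"\1", s)
    # Pass 2: every remaining character is written twice.
    return re.sub(r"(.)", r"\1\1", collapsed)


print(undedup_two_pass("ABBB"))  # AABBBB
```

For ABBB the first pass yields ABB (only the BB pair collapses) and the second pass doubles each of those three characters.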


4

Befunge 98, 24 bytes

#@~#;:::#@,~-:!j;$,;-\,;

Try it Online!

$ can be easily replaced with -, and the 2nd @ with ;.

I think this can be golfed further due to the - at the beginning of both -, (or $, above) and -\,.

How?

Stack notation:  bottom [A, B, C, D] top

#@~     Pushes the first character onto the stack (C henceforth) and ends if EOF
#;      No-op to be used later
:::     Now stack is [C, C, C, C]

#@,~    Prints C, and if EOF is next (odd consecutive Cs), prints again and ends
        Let's call the next character D

-       Now stack is [C, C, C-D]
:!j;    If C == D, go to "$," Else, go to "-\,"

===(C == D)===

$,      C == D (i.e. a pair of Cs) so we discard top and print C (Stack is now [C])
;-\,;   Skipped, IP wraps, and loop starts again

===(C != D)===

-       Stack is [C, C-(C-D)]  By expanding: [C, C - C + D] or just [C, D]
\,      Prints C (Stack is now [D])

;#@~#;  This is skipped, because we already read the first character of a set of Ds,
        and this algorithm works by checking the odd character in a set of
        consecutive similar characters. We already read D, so we don't
        need to read another character.

3

Java 7, 58 bytes

String c(String s){return s.replaceAll("(.)\\1?","$1$1");}

Ungolfed:

String c(String s){
  return s.replaceAll("(.)\\1?", "$1$1");
}

Test code:

Try it here.

class M{
  static String c(String s){return s.replaceAll("(.)\\1?","$1$1");}

  public static void main(String[] a){
    System.out.println(c("AABBCCDDEFFGG"));
    System.out.println(c("AAAABBBCCCCDD"));
    System.out.println(c("AAABBBCCCCDDDD"));
    System.out.println(c("HEY"));
    System.out.println(c("AAAAAAA"));
    System.out.println(c("N"));
    System.out.println(c("OOQQO"));
    System.out.println(c("ABBB"));
    System.out.println(c("ABBA"));
  }
}

Output:

AABBCCDDEEFFGG
AAAABBBBCCCCDD
AAAABBBBCCCCDDDD
HHEEYY
AAAAAAAA
NN
OOQQOO
AABBBB
AABBAA

2

PHP, 65 bytes, no regex

while(""<$c=($s=$argv[1])[$i])if($c!=$s[++$i]||!$k=!$k)echo$c.$c;

takes input from command line argument. Run with -r.

Regex? In PHP, the regex used by most answers duplicates every character. It would be 44 bytes:

<?=preg_replace("#(.)\1?#","$1$1",$argv[1]);

2

Brain-Flak 69 Bytes

Includes +3 for -c

{((({}<>))<>[({})]<(())>){((<{}{}>))}{}{(<{}{}>)}{}}<>{({}<>)<>}<>

Try it Online!

Explanation:

Part 1:
{((({}<>))<>[({})]<(())>){((<{}{}>))}{}{(<{}{}>)}{}}<>

{                                                  }   # loop through all letters
 (   {}     [ {} ]<(())>){((<{}{}>))}{}                # equals from the wiki   
                                                       # but first:
  ((  <>))<>                                           # push the top letter on the other 
                                                       # stack twice  
             (  )                                      # push the second letter back on
                                       {        }      # if they were equal:
                                        (<    >)       # push a 0 to exit this loop
                                          {}{}         # after popping the 1 from the 
                                                       # comparison and the next letter
                                                       # (the duplicate)
                                                 {}    # pop the extra 0
                                                    <> # switch stacks

Part 2 (at this point, everything is duplicated in reverse order):
{({}<>)<>}<>

{        }   # for every letter:
 ({}<>)      # move the top letter to the other stack
       <>    # and switch back
          <> # Finally switch stacks and implicitly print


1

V 10 bytes

Í¨.©±½/±±

TryItOnline

Just a find-and-replace regex like all of the rest in the thread. The only difference is that I can replace anything that would require a \ in front of it with the character that has the same ASCII value, but with the high bit set. (So (, 00101000, becomes ¨, 10101000.)
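
The high-bit mapping can be checked with a few lines of Python. This is a sketch under the assumption that the characters are read as Latin-1 code points; the helper name high_bit is made up, and the choice of the characters ), 1 and = is a guess at what ©, ± and ½ in the code above stand for.

```python
def high_bit(ch):
    """Return the character whose code is ch's ASCII code with bit 7 set."""
    return chr(ord(ch) | 0x80)


# Print each ASCII character, its bits, and its high-bit counterpart.
for ch in "()1=":
    mapped = high_bit(ch)
    print(ch, format(ord(ch), "08b"), "->", mapped, format(ord(mapped), "08b"))
```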


1

Perl 6, 17 bytes

s:g/(.)$0?/$0$0/

with -p command-line switch

Example:

$ perl6 -pe 's:g/(.)$0?/$0$0/' <<< 'AAABBBCCCCDDDD
> HEY
> AAAAAAA
> N
> OOQQO
> ABBB
> ABBA'
AAAABBBBCCCCDDDD
HHEEYY
AAAAAAAA
NN
OOQQOO
AABBBB
AABBAA

1

Racket 261 bytes

(let((l(string->list s))(r reverse)(c cons)(e even?)(t rest)(i first))(let p((l(t l))(ol(c(i l)'())))
(cond[(empty? l)(list->string(if(e(length ol))(r ol)(r(c(i ol)ol))))][(or(equal?(i ol)(i l))(e(length ol)))
(p(t l)(c(i l)ol))][(p(t l)(c(i l)(c(i ol)ol)))])))

Ungolfed:

(define (f s)
  (let ((l (string->list s)))
    (let loop ((l (rest l))
               (ol (cons (first l) '())))
      (cond
        [(empty? l)
         (list->string(if (even? (length ol))
                          (reverse ol)
                          (reverse (cons (first ol) ol))))]
        [(or (equal? (first ol) (first l)) 
             (even? (length ol)))
         (loop (rest l) (cons (first l) ol))]
        [else
         (loop (rest l) (cons (first l) (cons (first ol) ol)))] ))))

Testing:

(f "ABBBCDDEFFGGG")

Output:

"AABBBBCCDDEEFFGGGG"

1

05AB1E, 10 bytes

.¡vy¬ygÉ×J

Try it online!

Explanation

.¡           # split string into groups of the same char
  v          # for each group
   y         # push the group
    ¬        # push the char the group consists of
     yg      # push the length of the group
       É     # check if the length of the group is odd
        ×    # repeat the char is-odd times (0 or 1)
         J   # join to string

1

Python3, 102 94 bytes

from collections import*
lambda s:"".join(c*(s.count(c)+1&-2)for c in OrderedDict.fromkeys(s))

Thanks to xnor for saving 8 bytes! -> bithack.


This doesn't keep the letters in the right order.
xnor

@xnor Thanks for mentioning! Fixed.
Yytsi

Looks good. You can write the expression x+x%2 as x&-2.
xnor

@xnor I tried s.count(c)&-2 and it returned an empty string... :/ Any thoughts?
Yytsi

1
Oh, you're right and I made a mistake. I think x+1&-2 should do it. Evens go to themselves and odds round up to evens.
xnor
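
The arithmetic in this exchange can be checked directly. A small Python sketch (the helper name round_up_to_even is made up for illustration) comparing x & -2, (x + 1) & -2 and x + x % 2:

```python
def round_up_to_even(x):
    # -2 is ...11110 in two's complement, so "& -2" clears the lowest bit.
    # Adding 1 first turns that rounding down into rounding up.
    return (x + 1) & -2


# Columns: x, x & -2 (rounds down), (x + 1) & -2, x + x % 2
for x in range(1, 7):
    print(x, x & -2, round_up_to_even(x), x + x % 2)
```

Since x & -2 rounds down, a character that occurs once gets a count of 0, which would be consistent with the empty output reported above.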

1

R, 81 bytes

r=rle(el(strsplit(scan(,""),"")));cat(do.call("rep",list(r$v,r$l+r$l%%2)),sep="")

Reads a string from stdin, splits it into a vector of characters and performs run-length encoding (rle). It then repeats each value from the rle a number of times equal to the sum of its run length and that length mod 2.

If we can read input separated by spaces (implicitly as a vector/array of characters) then we can skip the splitting part and the program reduces to 64 bytes:

r=rle(scan(,""));cat(do.call("rep",list(r$v,r$l+r$l%%2)),sep="")

1

><> (Fish) 39 bytes

0v ;oo:~/:@@:@=?!voo
 >i:1+?!\|o !:  !<

Pretty sure this can be golfed a lot using a different technique.

It reads a character and compares it against the current stack item. If they differ, it prints the first stack item twice; if they are the same, it prints them both.

When the stack is empty it is supplied with a 0, which prints nothing and so can be appended at any time.


1

Pyth, 15 bytes

Vrz8p*+hN%hN2eN

Verify all test cases here.

Thanks to Luis Mendo for the methodology.

Explanation

Vrz8p*+hN%hN2eN    z autoinitializes to the input
 rz8               run-length encode the input, returned as list of tuples (A -> [[1,"A"]])
V                  for every element N in this list
      +hN          add the head element of N (the number in the tuple)
         %hN2      to the head element of N mod 2
     *       eN    repeat the tail element of N that many times (the letter in the tuple)
    p              print repeated character without trailing newline

As is often the case, I feel like this could be shorter. I think there should be a better way to extract elements from the list than what I am using here.


1

PowerShell, 28 bytes

$args-replace'(.)\1?','$1$1'

Try it online! (includes all test cases)

Port of the Retina answer. The only points of note are that we've got $args instead of the usual $args[0] (since -replace will iterate over each item in the input array, we can golf off the index), and that the '$1$1' needs to be in single quotes so that they're replaced with the regex variables rather than being treated as PowerShell variables (which would happen if they were double-quoted).


1

C, 67 bytes

i;f(char*s,char*d){i=*s++;*d++=i;*d++=i;*s?f(i-*s?s:++s,d):(*d=0);}

Call with:

#include <stdio.h>

int main()
{
    char *in="AAABBBCCCCDDDD";
    char out[128];
    f(in,out);
    puts(out);
}

Licensed under cc by-sa 3.0 with attribution required.