Intro

Well, the time has finally come to talk about AUTO_DATA_CONVERT. If you've ever looked around in the config files for YAYA, you may be familiar with it. Or maybe you've used a SHIORI event whose documentation talks about separating information with "byte 1" characters, or 0x01. Or maybe you've updated your YAYA to the latest version and had to turn this option back on, lest some parts of your ghost break. So what is it? Why does it break things? What can you do about it? Let's get into that.

Index

  1. SHIORI Events
  2. What AUTO_DATA_CONVERT does
  3. Without AUTO_DATA_CONVERT
  4. 0x01 and commas
  5. Ok, so just turn it off?
  6. reference.raw
  7. Conclusion

SHIORI Events

First, we need to talk about SHIORI events. In short, when a SHIORI event is sent to the ghost from the baseware, all of the information is passed along as strings separated by linebreaks. This includes reference information! Here is an example of a SHIORI event.

// request
GET SHIORI/3.0
Charset: UTF-8
Sender: SSP
SenderType: internal
SecurityLevel: local
ID: OnKeyPress
Reference0: t
Reference1: 84
Reference2: 1
Reference3: 0
Reference4: 


// response (Execution time : 16[ms])
SHIORI/3.0 200 OK
Sender: AYA
Charset: UTF-8
Value: \0\s[0]Hey it's some dialogue

In the example above, you can see that when the request is made, 5 references are sent. Most of them are numbers. However, all of this information is simply sent as a string. Why is that important?

The reason is that you can't do math operations on strings. So, when YAYA breaks these down into reference variables for you, it will attempt to convert them to numbers if at all possible. However, the way it does this depends on your settings, and the difference is critical.
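To make the "everything arrives as a string" point concrete, here is a rough sketch in Python (not YAYA) of breaking a request like the one above into its headers. The function name parse_request and the parsing details are made up for illustration; this is not how YAYA itself is implemented.

```python
# Illustrative model only: this is Python, not YAYA, and parse_request is a made-up name.
REQUEST = """GET SHIORI/3.0
Charset: UTF-8
Sender: SSP
ID: OnKeyPress
Reference0: t
Reference1: 84
Reference2: 1
Reference3: 0
Reference4: 

"""

def parse_request(text):
    """Split a SHIORI-style request into a dict of header name -> string value."""
    headers = {}
    lines = text.splitlines()
    for line in lines[1:]:          # skip the "GET SHIORI/3.0" request line
        if not line:
            continue                # skip blank lines
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers

headers = parse_request(REQUEST)
print(repr(headers["Reference1"]))          # prints '84' (a string, not a number)
print(isinstance(headers["Reference3"], str))  # prints True
```

Until something converts it, "84" here is just two characters, which is why a conversion step exists at all.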

What AUTO_DATA_CONVERT does

Here's where things get messy! AUTO_DATA_CONVERT is... I'm not sure how to describe it, exactly. I'm truly not sure why it exists! I think it may have been put into place to make YAYA behave like AYA, to help people who were converting? But don't quote me on that, I haven't tested anything in AYA for a long time so I'm not certain. Anyways, as I said, when YAYA breaks down the data into references, it will attempt to convert the information to integers if it can. So if the data can be a number, it will be a number.

Ok, great! Sounds nice! ... Except. Sometimes, you don't want the data to be a number. Here's a couple of examples I have personally run into.

When setting up a weather system in my ghosts (modified from code by Yuyudev), one of the ways the user could find their location was to put in their zip code. All fine and dandy, easy to deal with. They should all be 5 digits, no problem! The issue is, some zip codes start with a leading 0. So for example, a zip code could be 01234. Well, that's fine, that's still 5 digits, right?

However, if you convert 01234 to an integer, integers don't have leading 0s. 01234 as an integer is 1234. That isn't a valid zip code. I had to add in extra code to deal with that edge case. What a pain.

A second example. In my ghost S the Skeleton, I wanted to add name checks to see if the user put in 0825 as a name, since that number is relevant to him. Ah, but look! Another leading 0! I had a heck of a time finding out why my check wasn't working, until I realized that the data I was receiving was the integer 825 instead of the string 0825. What a pain!

A third example, which didn't happen to me, but I saw someone else run into it. They wanted the user to put in a fake phone number for an event in their ghost, but for some reason the number was coming out all wrong. That's because the base YAYA code was attempting to convert it to an integer, but the number was over the 32-bit integer limit! What a confusing bug to run into: numbers from the input box coming out as a totally different number.

All of these have one thing in common. If they had been left alone as a string, they would have been fine, but when attempting to convert them to an integer some information was lost. And because this conversion happens in the base dic files for YAYA, you can't get your hands on that data before it's converted!
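The three cases above can be imitated with a small Python model (again, not YAYA). The name aggressive_convert is hypothetical, and the 32-bit wraparound is only an assumption about how the overflow might show up; the real conversion may mangle large numbers differently.

```python
# Illustrative model only: Python, not YAYA. aggressive_convert is a hypothetical name,
# and the 32-bit wraparound is an assumption about how the overflow shows up.
def aggressive_convert(text):
    """Turn text into an int whenever it is all digits, otherwise leave it alone."""
    if not (text.isascii() and text.isdigit()):
        return text
    value = int(text)
    # Assumed: squeeze the value into a signed 32-bit integer by wrapping around.
    value &= 0xFFFFFFFF
    if value >= 0x80000000:
        value -= 0x100000000
    return value

print(aggressive_convert("01234"))       # prints 1234 (leading zero gone)
print(aggressive_convert("0825"))        # prints 825
print(aggressive_convert("5551234567"))  # prints 1256267271 (a totally different number)
print(aggressive_convert("t"))           # prints t (not a number, left alone)
```

In every lossy case the original text can no longer be recovered from the result, which is the whole problem.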

Without AUTO_DATA_CONVERT

The behavior without AUTO_DATA_CONVERT is a little different. The problem with AUTO_DATA_CONVERT and numbers is that it is simply too aggressive. Without AUTO_DATA_CONVERT, YAYA uses TOAUTOEX to convert the data, which has an additional check to ensure that no data is lost when converting. The check is simple: the data is converted to a number, and that number is converted back to a string. If the result is still the same as the original, no data would be lost, so it goes ahead with the conversion. Otherwise, it leaves the data as a string.

This resolves all of the above issues neatly!
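Here is a minimal Python sketch of that round-trip idea. It is not the real TOAUTOEX; to_int32 and careful_convert are hypothetical names, and the lossy 32-bit conversion inside to_int32 is an assumption used only to give the check something to catch.

```python
# Illustrative model only: Python, not the real TOAUTOEX. Both function names are hypothetical.
def to_int32(text):
    """Assumed stand-in for a lossy 32-bit conversion; returns None if text is not a number."""
    try:
        value = int(text)
    except ValueError:
        return None
    value &= 0xFFFFFFFF                     # wrap into 32 bits (assumption)
    return value - 0x100000000 if value >= 0x80000000 else value

def careful_convert(text):
    """Convert only if turning the number back into a string gives the original text."""
    value = to_int32(text)
    if value is not None and str(value) == text:
        return value                        # round trip matched, nothing was lost
    return text                             # something would be lost, keep the string

print(repr(careful_convert("84")))          # prints 84
print(repr(careful_convert("01234")))       # prints '01234'
print(repr(careful_convert("0825")))        # prints '0825'
print(repr(careful_convert("5551234567")))  # prints '5551234567'
```

The zip code, the name check, and the phone number all fail the round trip, so all three stay as strings.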

0x01 and commas

One more thing that you should know about AUTO_DATA_CONVERT before we move on. In addition to converting strings to numbers when possible, it also converts the byte 0x01 to a comma. What does that mean?

Well, 0x01 is what machine translation on Ukadoc usually calls a "byte-1" character. It's a character that the user can't really type, and as a result is super handy as a delimiter for splitting up arrays of data. Especially with simple arrays! In fact, YAYA as SHIORI has a special setup for this, where you can write C_BYTE1 and it will place a 0x01 character for you. Really useful for all sorts of things.

So useful, that some SHIORI events, such as OnBIFFComplete, use 0x01 to split up data! In particular, this is the event for successfully checking for emails, and reference7 contains the headers for each email. Trying to split up this information with commas would be a disaster! Email headers often have commas in them, so if you use commas to split up the array, then you won't be able to reliably tell where one entry stops and another begins!

... Are you seeing where this is going?

Yeah. AUTO_DATA_CONVERT changes all those 0x01 characters to commas. You can't reliably tell where one entry stops and another begins.

So, you can see the problem.
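If it helps to see it, here is the same problem in a few lines of Python (not YAYA). The email subjects are made up for the example.

```python
# Illustrative model only: Python, not YAYA. The header texts are invented sample data.
BYTE1 = "\x01"   # the 0x01 / "byte 1" delimiter

headers = ["Re: Lunch, maybe?", "Invoice 12, 13 and 14", "Hello"]
packed = BYTE1.join(headers)            # how the list travels as one string

# Splitting on 0x01 gives back exactly the three original entries.
print(packed.split(BYTE1) == headers)   # prints True

# Now replace every 0x01 with a comma and try to split on commas instead.
converted = packed.replace(BYTE1, ",")
print(len(converted.split(",")))        # prints 5, even though there were only 3 headers
```

Once the swap has happened there is no way to tell which commas were delimiters and which were part of a header.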

Ok, so just turn it off?

Yes! Plain and simple, yes. I'll be honest with you, I have no idea what the benefit of having AUTO_DATA_CONVERT on would be. If someone knows, please tell me. If you're converting from AYA, you might want to check and see if there's anything that relies on AUTO_DATA_CONVERT, but I'm really not sure.

There is one snag when turning AUTO_DATA_CONVERT off: if you have code that already relies on the old behavior, you will have to adjust it. For example, if you had set up email header code that attempted to break down the headers based on commas, you would need to change it to use C_BYTE1 instead. It shouldn't be too big of an issue, but you should definitely double check before switching, since it may introduce subtle bugs.

reference.raw

There is one saving grace if you don't want to turn off AUTO_DATA_CONVERT, and that is reference.raw. reference.raw is what it sounds like: the raw data from the references, without conversion. Unlike the other references, this is only available as an array. So you can't write reference.raw7, you have to write reference.raw[7]. But if you need to keep AUTO_DATA_CONVERT on, you can use this for the edge cases mentioned above whenever you need access to the raw data as a string. It's very helpful.

reference.raw[0] is reference0 but without conversion, reference.raw[1] is reference1 without conversion, and so on and so forth.
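As a mental model, you can picture two lists kept side by side: one with the untouched strings and one with the converted values. This is a Python sketch with made-up data and made-up variable names, not YAYA code.

```python
# Illustrative model only: Python, not YAYA. raw and converted are hypothetical stand-ins
# for reference.raw[n] and referenceN with AUTO_DATA_CONVERT on.
raw = ["0825", "84", "t"]   # invented incoming reference strings
converted = [int(s) if s.isascii() and s.isdigit() else s for s in raw]

print(converted[0])          # prints 825
print(raw[0])                # prints 0825
print(raw[0] == "0825")      # prints True, so a name check against "0825" works on the raw value
```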

Conclusion

What a headache! But not as much of a headache as I originally thought. If you had read the original version of this guide, then just know, at some point some wires got crossed in my head and I mixed up having AUTO_DATA_CONVERT off with reference.raw, and thought all of the data would be strings unless you converted it. Thankfully, this isn't the case! What a nightmare that would have been, truly.

The reason I am writing about this is that in 2022, the YAYA as SHIORI dic files updated to have AUTO_DATA_CONVERT off by default, when before it was on by default. So, anyone updating their YAYA may run into issues. Additionally, I plan to update my templates to match, since I want their base YAYA files to match the originals (except for translations).

I think it would be best for those making free code to turn AUTO_DATA_CONVERT off, or at least plan for their code to be used in environments without this feature. That way, anyone can use the code as-is without having to worry about this issue. That's my recommendation, anyways.

Hopefully this helps to explain things!