How do I wrap UTF-8 encoded C++ std::strings with SWIG in C#?

My question is nearly identical to this question, except that the linked question deals with char*, whereas I'm using std::string in my code. Like the linked question, I'm also using C# as my target language.

I have a class written in C++:

class MyClass
{
public:
    const std::string get_value() const; // returns utf8-string
    void set_value(const std::string &value); // sets utf8-string
private:
    // ...
};

SWIG wraps it in C# as follows:

public class MyClass
{
    public string get_value();
    public void set_value(string value);
}

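SWIG does everything for me, except that it does not convert strings between UTF-8 and UTF-16 during calls to MyClass. My strings come through fine if they are representable in ASCII, but if I pass a string with non-ASCII characters on a round trip through "set_value" and "get_value", I end up with unintelligible characters.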
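
The garbling described above is what one would expect if the UTF-8 bytes are decoded one byte at a time under a single-byte code page instead of as UTF-8. Here is a minimal C++ sketch of that effect, assuming a Latin-1-style code page (the actual default depends on the platform). The helper names `widen_bytes` and `decode_utf8` are hypothetical, and `decode_utf8` handles only 1- to 3-byte sequences:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treats every byte as its own character, which is
// roughly what a single-byte (Latin-1-style) decode does to UTF-8 data.
std::u16string widen_bytes(const std::string &bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Hypothetical helper: decodes UTF-8 into UTF-16. Only 1- to 3-byte
// sequences (the Basic Multilingual Plane) are handled, and continuation
// bytes are not validated in this sketch.
std::u16string decode_utf8(const std::string &bytes) {
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {
            // 1 byte: plain ASCII
            out.push_back(static_cast<char16_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            // 2 bytes: 110xxxxx 10xxxxxx
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
            i += 2;
        } else if ((b0 & 0xF0) == 0xE0 && i + 2 < bytes.size()) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            const unsigned char b2 = static_cast<unsigned char>(bytes[i + 2]);
            out.push_back(static_cast<char16_t>(
                ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
            i += 3;
        } else {
            // unsupported or truncated sequence: emit U+FFFD and move on
            out.push_back(static_cast<char16_t>(0xFFFD));
            i += 1;
        }
    }
    return out;
}
```

For the two UTF-8 bytes of "é" (0xC3 0xA9), `widen_bytes` yields the two code units U+00C3 U+00A9 ("Ã©"), while `decode_utf8` yields the single code unit U+00E9. Pure ASCII input is identical under both, which matches the observation that ASCII strings survive the round trip.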

How can I wrap UTF-8 encoded C++ strings with SWIG in C#? Note: I am using std::string, not std::wstring and not char*.

There is a partial solution on the SWIG sourceforge site, but it deals with char* rather than std::string, and it uses a (configurable) fixed-length buffer.

1 Answer

  • 3

    With the help (read: genius!) of David Jeske in the linked Code Project article, I was finally able to answer this question.

    You'll need this class (from David Jeske's code) in your C# library.

    using System;
    using System.Runtime.InteropServices;
    using System.Text;

    public class UTF8Marshaler : ICustomMarshaler {
        static UTF8Marshaler static_instance;
    
        public IntPtr MarshalManagedToNative(object managedObj) {
            if (managedObj == null)
                return IntPtr.Zero;
            if (!(managedObj is string))
                throw new MarshalDirectiveException(
                       "UTF8Marshaler must be used on a string.");
    
            // GetBytes does not append a terminating null, so allocate one extra byte
            byte[] strbuf = Encoding.UTF8.GetBytes((string)managedObj);
            IntPtr buffer = Marshal.AllocHGlobal(strbuf.Length + 1);
            Marshal.Copy(strbuf, 0, buffer, strbuf.Length);
    
            // write the terminating null at offset strbuf.Length
            Marshal.WriteByte(buffer, strbuf.Length, 0);
            return buffer;
        }
    
        // uses pointers, so the project must be compiled with /unsafe
        public unsafe object MarshalNativeToManaged(IntPtr pNativeData) {
            if (pNativeData == IntPtr.Zero)
                return null;

            byte* walk = (byte*)pNativeData;
    
            // find the terminating null
            while (*walk != 0) {
                walk++;
            }
            int length = (int)(walk - (byte*)pNativeData);
    
            // copy the bytes, excluding the terminating null
            byte[] strbuf = new byte[length];
            Marshal.Copy(pNativeData, strbuf, 0, length);
            return Encoding.UTF8.GetString(strbuf);
        }
    
        public void CleanUpNativeData(IntPtr pNativeData) {
            Marshal.FreeHGlobal(pNativeData);            
        }
    
        public void CleanUpManagedData(object managedObj) {
        }
    
        public int GetNativeDataSize() {
            return -1;
        }
    
        public static ICustomMarshaler GetInstance(string cookie) {
            if (static_instance == null) {
                return static_instance = new UTF8Marshaler();
            }
            return static_instance;
        }
    }
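
One consequence of the marshaler above is that it finds the end of the native string by scanning for the first zero byte, so it assumes the UTF-8 data contains no embedded NUL. Here is a C++ sketch of the same two steps on the native side, for illustration only. `to_native` and `from_native` are hypothetical names that mirror MarshalManagedToNative and MarshalNativeToManaged, and `std::malloc`/`std::free` stand in for AllocHGlobal/FreeHGlobal:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical analogue of MarshalManagedToNative: copies the bytes into a
// freshly allocated buffer and appends the terminating NUL.
// The caller releases the buffer with std::free.
char *to_native(const std::string &utf8) {
    char *buffer = static_cast<char *>(std::malloc(utf8.size() + 1));
    if (buffer == nullptr) {
        return nullptr;
    }
    std::memcpy(buffer, utf8.data(), utf8.size());
    buffer[utf8.size()] = '\0';
    return buffer;
}

// Hypothetical analogue of MarshalNativeToManaged: walks to the first zero
// byte and copies everything before it.
std::string from_native(const char *native) {
    if (native == nullptr) {
        return std::string();
    }
    const char *walk = native;
    while (*walk != '\0') {
        ++walk;
    }
    return std::string(native, static_cast<std::size_t>(walk - native));
}
```

Multi-byte UTF-8 sequences pass through unchanged because none of their bytes is zero, whereas a std::string holding an embedded '\0' would come back truncated at that byte.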
    

    Then, in SWIG's "std_string.i", replace this line at line 24:

    %typemap(imtype) string "string"
    

    with this line:

    %typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") string "string"
    

    At line 61, replace this line:

    %typemap(imtype) const string & "string"
    

    with this line:

    %typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") string & "string"
    

    Voilà, everything works. Read the linked article for a better understanding of how this works.
